{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Section 20: Network-Driven Supervised Machine Learning\n",
    "\n",
    "## 20.1 The Basics of Supervised Machine Learning\n",
    "\n",
    "We want to construct a model that maps iris measurements to one of three species categories. In machine learning, the input measurements are called **features**, and the output categories are called **classes**. The goal of supervised learning is to construct a model that can identify classes based on features. Such a model is called a **classifier**.\n",
    "\n",
    "As seen in Section Fourteen, we can load known iris features and class labels using Scikit-Learn’s `load_iris` function. \n",
    "\n",
    "**Listing 20. 1. Loading iris features and class labels**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "We have 150 labeled examples across the following 3 classes:\n",
      "{0, 1, 2}\n",
      "\n",
      "First four feature rows:\n",
      "[[5.1 3.5 1.4 0.2]\n",
      " [4.9 3.  1.4 0.2]\n",
      " [4.7 3.2 1.3 0.2]\n",
      " [4.6 3.1 1.5 0.2]]\n",
      "\n",
      "First four labels:\n",
      "[0 0 0 0]\n"
     ]
    }
   ],
   "source": [
    "from sklearn.datasets import load_iris\n",
    "X, y = load_iris(return_X_y=True)\n",
    "num_classes = len(set(y))\n",
    "print(f\"We have {y.size} labeled examples across the following \"\n",
    "      f\"{num_classes} classes:\\n{set(y)}\\n\")\n",
    "print(f\"First four feature rows:\\n{X[:4]}\")\n",
    "print(f\"\\nFirst four labels:\\n{y[:4]}\")"
   ]
  },
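  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The integer class labels are easier to interpret alongside their names. As a minimal sketch, the cell below calls `load_iris` without `return_X_y`, which returns an object exposing `feature_names` and `target_names`. The variable name `iris` is our own choice and is not used elsewhere in this section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import load_iris\n",
    "\n",
    "# `iris` is an illustrative name; it does not overwrite X or y from Listing 20.1\n",
    "iris = load_iris()\n",
    "print(f\"Feature names: {iris.feature_names}\")\n",
    "print(f\"Class names: {list(iris.target_names)}\")"
   ]
  },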
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All 150 flower measurements have been labeled as belonging to one of three flower species. Imagine that we only have the resources to label one-fourth of the flowers. We'd label that data and then train a model to predict the classes of the remaining flowers. Let's simulate this scenario.\n",
    "\n",
    "**Listing 20. 2. Creating a training set**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training set labels:\n",
      "[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]\n"
     ]
    }
   ],
   "source": [
    "sampling_size = int(y.size / 4)\n",
    "X_train, y_train = X[:sampling_size], y[:sampling_size]\n",
    "print(f\"Training set labels:\\n{y_train}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our training set contains only examples labeled as _Species 0_, because the rows of the iris dataset are ordered by class and we took the first 37 of them. The remaining two flower species are not represented. To represent all three species, we should sample at random from `X` and `y`.\n",
    "\n",
    "**Listing 20. 3. Creating a training set through random sampling**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training set labels:\n",
      "[0 2 1 2 1 0 2 0 2 0 0 2 0 2 1 1 1 2 2 1 1 0 1 2 2 0 1 1 1 1 0 0 0 2 1 2 0]\n"
     ]
    }
   ],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "import numpy as np\n",
    "np.random.seed(0)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.25)\n",
    "print(f\"Training set labels:\\n{y_train}\")"
   ]
  },
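  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Random sampling does not guarantee that the classes are evenly represented. One option, sketched below as an assumption rather than as the approach used in this section, is to pass `stratify` to `train_test_split` so that class proportions are preserved. The names `X_all`, `y_all`, `X_strat`, `y_strat`, `X_rest`, and `y_rest`, along with the `random_state` value, are illustrative, and the cell leaves `X_train` and `y_train` untouched."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "X_all, y_all = load_iris(return_X_y=True)\n",
    "# stratify=y_all keeps the class proportions of y_all in the sampled subset.\n",
    "# All variable names here are illustrative; random_state=0 is an arbitrary choice.\n",
    "X_strat, X_rest, y_strat, y_rest = train_test_split(\n",
    "    X_all, y_all, train_size=0.25, stratify=y_all, random_state=0)\n",
    "print(f\"Examples per class in the stratified sample: {np.bincount(y_strat)}\")"
   ]
  },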
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we saw in Section Fourteen, the features within the iris dataset can be plotted in multi-dimensional space. This plotted data forms spatial clusters. Hence, each element in `X_test` is likely to share its class with the `X_train` points in the same cluster. We can see this by plotting the first two principal components of both sets.\n",
    "\n",
    "**Listing 20. 4. Plotting the training and test sets**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAD4CAYAAADrRI2NAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAswElEQVR4nO3dfXRb5Z0n8O/PxsE2SQh5bRy/KAcYBsJbmjRTFs6eYSbQhAxvoWEB0TO73aLkFIbuTjelricZXqpN5qS7p01LSV0mtAUNOcxAm840hZApbJhpmZBMQ8kLMw5gO0p6SAiTF9dy7Fi//UOSLcv3Xl1JV7r36n4/5+TEur6SHsvSz8/9Pc/ze0RVQUREwVHjdgOIiKiyGPiJiAKGgZ+IKGAY+ImIAoaBn4goYM5zuwFWpk+frqFQyO1mEBH5xp49ez5S1RlW53g68IdCIezevdvtZhAR+YaI9OQ7h6keIqKAYeAnIgoYBn4iooBh4CciChgGfiKigGHgr6DkcB/OHHsayeHfud0UIgowBv4KOtv3JpLDp3C271duN4WIAoyBv0KSw30Y7N8HABjs389ePxG5hoG/Qs72vZl1S9nrJyLXMPBXwGhvfzh9ZJi9fiJyDQN/BYzt7Wew109E7mDgr4Chgfcw2tvPGE4fJyI/icWAUAioqUn9H4u53aLCebpIW7WYPGul200gIgfEYkAkAvT3p2739KRuA0A47F67CsUePxGRTR0do0E/o78/ddxPGPiJiGzq7S3suFcx8BMR2dTaWthxr2LgJyKyKRoFGhvHHmtsTB33Ewb+YlXD0D4RFSQcBjo7gbY2QCT1f2envwZ2Ac7qKU61DO0TUcHCYf9/zNnjL0a1DO0TUSAx8BejWob2iSiQGPiLUS1D+0QUSAz8xaiWoX0iCiQG/mLkDO1rawsS37gXyXvucLtlRER5MfAXKxwGuruBZBIDv9mMwTsuZ7VNIvIFBv4ScWctIvIbBv4ScWctIjLj1XWejgR+EdksIsdEZJ/J90VENorIIRH5jYh80onndRt31iIiM5l1nj09gOroOk8vBH+nevw/ALDE4vtLAVya/hcB8JRDz+sq7qxFRGa8vM7TkcCvqjsBfGxxyu0AfqQpbwKYIiKznXhuN3FnLSIy4+V1npWq1TMHwOGs2/H0sd/mnigiEaSuCtDq8QVR3FmLiMy0tqbSO0bH3VapwV0xOKZGJ6pqp6ouVNWFM2bMKHOziIjKw8vrPCsV+OMAWrJuNwM4WqHnJiKqOC+XcK5UquenAB4SkS0A/gDAKVUdl+YhIqomXi3h7EjgF5HnAfwhgOkiEgfwlwDqAEBVNwHYBuAWAIcA9AP4b048LxERFc6RwK+q9+b5vgJ40InnIiKi0nDlLhFRwDDwExEFDAM/EVHAMPATEQUMAz8RUcAw8HtEcrgPZ449zcqeRFR2DPwecbbvTSSHT7GyJxGVHQN/sRzcYYG7eBFRJTHwF8PhHRa4ixcRVRIDfzEc3GGBu3gRUaUx8BfDwR0WuIsXEVUaA38xTHZS0Jbmgh+Ku3gRUaVVqixzdYlGUzn9rHSPNtRhaE0YEwp8KO7iRUSVxh5/McJhJDd9G8k5F0EFSM6ZgsSGzyKx7BPMzROR57HHX6Szt7VicPFajE3TpHLzDRcutvUYyeE+/O7EFlww7V7U1F5QlnYSEeVij79ITuTmuWiLiNzAHn+RSs3N5y7aOn/idez1E1FFsMfvEi7aIiK3MPC7gIu2iMhNDPwu4KItIvcFuSIuA78LuGiLyH1BnlzBwd0yyDdNk4u2iNwV9MkV7PGXQZB7EkR+EPTJFQz8DmNtfSJv4+QKBn7HBb0nQd4T5EFMI5xcwcBfmpxduJLPbg58T4K8h6nHsTi5goO7xcvswpWp0NnTA1n5RdRtWI6h5QuyTiysfg+Rk4I+iGmEkyvY4y+ewS5ckjiL+nU/zzkxWD0J8hamHotT7ekxR3r8
IrIEwLcA1AJ4WlXX53z/DwFsBfBB+tBLqvq4E8/tGpPdtuToSQDnYdLMLwS+Z0XuMhvEZK8/v+z0WDVerZfc4xeRWgBPAlgK4AoA94rIFQanvqGq16b/+TvoA+a7cDVNAXtW5AUcxCxOEGbmOZHqWQTgkKq+r6qDALYAuN2Bx/W2aBRobBxzSAUYWnw5OKhLXsBBzOIEIT3mRKpnDoDDWbfjAP7A4LzrRORtAEcB/C9V3W/0YCISARABgFaTXrUnhMPAP/8zsGkToAoAEAUmvLAbw58KYWj5oqq9TCR/4CBm4YKSHnOixy8GxzTn9r8CaFPVawB8G8BPzB5MVTtVdaGqLpwxY4YDzSujbdtGgn6GJIbSA7zsWRH5TVDSY04E/jiAlqzbzUj16keo6mlV7Ut/vQ1AnYhMd+C53WUxwDtp5ir2uIh8JijpMSdSPW8BuFRE5gI4AuAeAPdlnyAinwDwoaqqiCxC6g/OCQee212trUBPz7jD2nQR0zxEPhSUzlrJPX5VPQfgIQCvADgI4AVV3S8iq0RkVfq0zwLYl87xbwRwj6rmpoP8x2iAt6EOA+1LOLhLVEHVPu/eaeLl+Ltw4ULdvXu3282wFosh2f7nkPgxaNMUDLQvTa/crcWExitt9frzlXEmImuJUzsw2P82JjReE/grbRHZo6oLrc7hyt1ShcPoe+txJL59LwCg4c+ex6RPfR11L+2ynRdkLRWi4nlp3r1frjwY+B0wecdENH5lK2qOnIQoUHPkJBq/shWTd0zMe18vvWkpmPwSrMx4ad69XzpxDPxOMKjbg/7+1PGcCp6Ixcac5qU3LQWTX4KVES/V1vdTJ46B3wkm0zrR05Oq4NnTk5rvn76dfHYzzhx7GucGj3nmTUveFIvFEAqFUFNTg1AohFhOx6FUfgpWRrw0795PnTgGfieYrTCurTW5EmhHcvgUEie3GdzJ228YqpxYLIZIJIKenh6oKnp6ehCJRBwN/n4KVka8Mu/eS1cednBWjxNiMWjkAUh/YvRYY+P4oJ+mApw+8g3Th5OaiYGZT0zmQqEQegzWibS1taG7u7vkx8/k9scGTm9VlvXLjLfUrKLswA8UMrPPSXZm9XAjFieEwxhK7MN5j26CHD0FaW1NzfHv6DBd4JXizhuD/KHXJIVodrxQVmkSr7wn/VIe2erKw4vtZuAvgFnvIznch8SyWcCyv8C4HlP2Ll0YXeCVUp0FoMgZra2thj1+p4oXej1Y+Wn3MLMr9Fgs1f/r7U1lhKPRVH1HtzHwFyDT+xg4sxPDg0dG/gAY5UkbLlw8+htO/+aTzTMw8MhNGFo+3/h8oizRaBSRSAT9WR2HxsZGRKPRvPe1kyLxejrR9HPlEwa7syISSX3tdvDn4K5N2b2PocRBJIdP4cyxv8bpD7+Hwf53YDqoEw4D3d1AMom+tx7PCfqp86utABQ5IxwOo7OzE21tbRARtLW1obOzE2EbUcOtKZpOrQnw22CpEatZ3m7j4K5NxoM3Zpi7J/eMHbSt7GCtU6UTvDRYWqyamnFV2wEAIkAyWb7nZckGBySH+3D6w84Cgj7AXjy5ya0pmk6uCfDKNM1SmA3FeGF/Keb48zjb9yY0eSbveSwORV7g5g5SdnLydqdn5o4/ZN/PL6LRcXM70NiYOu429vgtZPdg8hns3+er/CP5Q6E5c7dWstrNyRc79uDEmEWe6imOC4eBzk6grS2V3mlrS912e2AXYOC3lPoQ2R0DGfbdqkfyvkzAO3Nss63g71aKxM4fnHypILM/cnZSSPn+QGZm2ORUT8Fzzw6UtUBd1twOdHd7I+gDTPWYGn2z5Y7CnAfg3Mitupf2oH7dzyFHT0LnTAXWb/TOb5d8bewV5yAGzuxE45Sllvdxa4qmnTUB+VJBZou17KSQ8i30Mpth87UOxa2Lvb9AzGkM/CbMe/vDmDRzVSo/GYsBX9k68o6S+MfemahLvpf7HhxKHERy0n8et3jQCyUN8v3B
yTf2YLZYy86YhZ2FXmaLnePxesv7VSumekykejBGc66yLl+9PFGXfM34ilMxcGbnmPP8UlI5XyrIbCaScQdsbArJziwms5k0zU2ZiRv+K1BXCgZ+E5NnrYTUGG+kMpIvNetG5Bz3+0YXVHlmV5ypxYOp95GfSipbpYKsBoaNO2CjYxZ2B5UNtsdGQ8MQ1ra/YXm/asVUj4W8+dLWVsMibLndC78UmiLvyHfF2XDhYsdLGsRiMXR0dKC3txetra2IRqO2VgnbYfVZSpzaYXA09fNMnB62XIw2cOYNjP+DMv61yKmegubmBNY+8jpWLH/X8n7Vij3+EiSfWAttmDD2YM5EXT/1ysg78l1xFlPSwOrK06r2f7mvWK2uBvKlcYYS/27wiMazmLJn2Ox760dYsfyArftVo6ot2VCJQa/EqR3Q5zaj/q/+ETXx44bl98YuPffXknPyplgshvb2P0c8fgzNTVOwtn0pVixfgHzvL6tyCla1/w++/bQjZRgKlW+/ADdLU3hZoEs2lHvQK9PjGlq+AGf+5WtIDp0ZM1HXuNRDsPKIVLj889FTPfPDh49BFTh85CQeXv13+NuX9sCqx5rvytOq9r9bV6zFDghTflUZ+CuRXsn3phst9WCcfyQykq/D0tHRMaZMMwAkEkP4+oZf4sLZXzbNped7v5rV+G9unmF5v2LZSR8VOyDsZ5VaXVyVgb9sPYH0b0VranD+vLtR99Ku9DfGvumsSz0EJ49IhbHTYSlmVy47QTIajaIxZ9pLY2MD1j5yk+X9imXninzyrJW4cPaXx/2bPGulpzZZd4rZ6uJyBP+qC/xl6wlk/VZEFTVH/gMNq/8OdS/tSZ9gdglaiwmN14x542ZmKvi9d0LOsjcf3bhnbrUrl50gaVT7/zvf+hJWLM9NFZceXJ24Iq+G6p25KrksqOoCf9l6Aga/FUkMoX7dz9O37F+C+mXRDZVfJuVxbvCYzfnoRj1z61257AbJcDiM7u5uJJNJdHd3465bQ7bul81OqsLqD5zdVIfV1YBf2VwW5Iiqm8fvxD6ihjOCTF79mqOncOHsL4/ctpqT3HDhYl/tI0rll+kEJE5uM/iu0Xz01OSBQubbFxsMC72fna0GrUowPL/lAs9uVVgJNpcFOcKR6ZwisgTAtwDUAnhaVdfnfF/S378FQD+A/6qq/5rvcd3agctw2lsoZPxbaWtLzeZJO/3h96DJvnGnSc1ETJ61ktM7aYTxdMWxMu8bP7DzEXnm6Xfwl4+2IX50EpqbTmNt+z9hxfIuTGi8Epdfs9jOR6xq5f7hBFLLggot5WxnOmfJPX4RqQXwJICbAMQBvCUiP1XV7NURSwFcmv73BwCeSv/vOaY9cpu7Klh9SN3cJIO8Z/xYkL87AflSFbEY8ODDv49Eog4AcPjIhXh49c0AgLs/+x56e41/9nKkOrwod3WxwbIgxziR418E4JCqvq+qgwC2ALg955zbAfxIU94EMEVEZjvw3I4zzT+a7KqQvOd22wO11TgTgYpTbdMRYzGgpsY4e5BJVXR0YCToZyQSdfj6hmWYPGulp7cqrJRK1e93IvDPAXA463Y8fazQcwAAIhIRkd0isvv48eMONM++vB9Gg99KIQO11TgTgYpTTZ2ATIpieFjGfS/7ojjfFYFRITWvbFVYbZwY3B3/2x5fVtDOOamDqp0AOoFUjr+0phXG6sNodAle6ECtX3K1VH5OTEIAvFGP32gaIgDU1o7NT+cbvKxkqiPonAj8cQAtWbebARwt4hzXFfphdLo6IgWHU50ANyq/5v6xMevJJ5Njg7adYbJwmIG+EpwI/G8BuFRE5gI4AuAeAPflnPNTAA+JyBakBnVPqepvHXhuRxXyYSxmoNYLvTOqHm5NDc79Y9PSkkRv7/iscUtLEtnZZPbovaPkHL+qngPwEIBXABwE8IKq7heRVSKyKn3aNgDvAzgE4PsAvljq87qtmBwtF26Rk9woUma06vbRNfvR0DA0
5ryGhiE82vHOuPt7dfPxoHFk5a6qblPV31PVi1U1mj62SVU3pb9WVX0w/f2rVLXyk/MdVuhALevyU4YT9e3dmhVk9Mfmrlt/iY0btqNlzimIKFrmnMLGDdux/Lb/V9a2UPGqbuVupRSao+V4AGU4kZcvdCKCE8z+2Eya+QV8fpVixfKcxWh6HpLDv2Na04OqrlaPk5zaeaja5mxT8Zy68nNjarDVH5tqmp4aBOzxW3BqxoQbvTPyJqeu/NyYGmz+x6YLmjxr+D2uTPcm9vhNmPbMitgpgQu3CCj8yq/ce90WyqwiZl39pRb3Kk+v32uvjd+wx2/CsGf2Dx+OKz+oD3wemjyLms993vSxuHCLgMKv/NyYo18M445NRuGL0uzwy2vjVQz8OZLDfej76G+gyX7k9szqv/Z/IONq8g9CO9oBi8BP1auQtRmFLBD0U/nucnRsrF5XP702XsXAn2N0r9xcChyOG95H4sdSsxe2/ISrUwKmkJ5nIQEy6LPArF7XoL82TmCOP0u+vXJ1zkWG39Gmi3Duma9XbsNM8oR8M3SKzUMXMhYQi8UQCoVQU1ODUCiEWBW836xeV86QcwYDf5Z8e+Xif2+ANkwYcx9tqMNA+xKc9+j3KrdhJnlCvpWzxa7Utjs1MhaLIRKJoKenB6qKnp4eRCIR3wd/q9eV00adwcCfZmuv3NtakdhwF5LN06AAtFaA9L67cuQ/jB84KLtIBEy+90sp8/Wtpk1mX0F0dHSgP6ez0d/fj46szobfZr/ke105Q84ZzPGn5ZtxMfKGXL4AANCw+u8giVR9EjlyEiowLjQdpF0kAiTf+6WUPLTZWEBmS9DMY/WadCqyj/tt9ku+15Uz5JzBHn9adk+i7qU9mPSpr2PynP+J86/8L0AsNuYNWb/u5ZGgnyGKVPDPxl0kqpZVz7MceWijK4hWk05F5rgf60OxR18Z7PGnjfQkYjHgK1tH8vUS/zg1V3/DCmD5/NSxoyZpHQW0tQVyOM5ZPVXOqueZOLXD4Ki9Xr/ZNEajK4hoNIpIJDIm3dPY2IhourOR76rDi2XC2aOvDPb4cxlsJySJQdSv+9nIbW2aYnhXnXMRBn6zmTVnA66UXqvRgLDZFcS999yBzs5OtLW1AbgPtbWH0d/fh46OMJ57diD/mBXLhAcWe/y5TPKmcvTkyNcD7UvH5PiB0dk9Q1xQEnjF9lqNFiYBijPHnsH4AaRUDz4cDgMI5y4oR2RlHTZuuAwrlh8Yd58xY1bgIqggYo8/l0neVJsuQk3tNACA3P95yPefQbJlJlSA5JwpSGz4LIaWLwCnlgWH0zNmjFIzqWODAJI5Z49eQRjteZtI1OLxdf/J9D7l3MSliHJWVGHs8ecy2Bg005tPDp8AkO4h3fMF9C3ugyb7ch6gPLVJyHucnDFjnM7Zh9Ge/nmYNPMLhr1ysxnD8aOTAQATGq8Zl9svdNtQu2KxceWsEImkvmbm0zsY+HOl353J9j+HxI9Bm6ZgoH1pujefkeohcSAquJxOlRhPY8weJzAfHG5tTQXYXM1NZwzbN3D6DYwfg3Cm9IHR1UdmHSMDv3cw1WMkHEbfW4/j9JFv4Mxbf5ET9AEuE/eXcixicjpVYl3hErB6z0WjqZnD2RoahrG2/Z8N2zd0tsvw8QuZMmn2mppdfWSO+21BWbVi4DeRXXt8QuM1Bmcwl+8XTsxeya6J09bWithzP4CT8/Rza92n3nO1OWcZv+fCYaCzE2hrA0SA1tYkNm7YnjWwO9q+5HAfoJnxgvMwaeaqkecstIic0Wtqtl4xczxzv4HTO/kHwEWBDvx2ex+DidJ7SOQOJxYx5dbE6e09jIdXv4C/fWlP1lnOdgQKnRIaDqdmECeTwLu/+QVWLP+3nDOMtkgsrs1Wr6nR1UdmHWP2/YYGDnIqqYsCneO3Ozg3oeHSrMGwWkxovJKDtz7hRAlfo5o4icQQHl/3c6wYSQMOYzDR5dj7opTxI7M/GoOJLkCz
t0gsblDX6jXN5PGNqpMnTo29H2A8PuLFhWXVJrCB3+7gXDlnQFB5OfW7M6uJEz96ChfO/vJIDZ0JDVZbEFaOda2f3LLjhf0xtPOahsPjB3LH32/kO+Oe32/1hfwosKkeu5e8LAPrX8X+7nJr3E+dOtXwvNbWVl/Vw3GiDk6xr6nx/QAg6VhVU7IvkIG/kCJaLBrlX8X87oxq3J8+fRq1tWMHWuvq6hCNRsu6EMppk2etxKSZKzE6aJwa3C0krVTs58F61tLo6+an19PPRNWolrA3LFy4UHfv3u34445e8ma/EZm7JyAUCqHHaFK8galTp+KvHl+KFenifSnmC628YOx7v/Lv+dMffs9g0SMgNRMxcXoYZ449jbGfS2+/nl4kIntUdaHVOYHs8bMXX32c2oLQLJ9v5OOPPy5qdk8l57KXexpqoXKnrWZPJWVatXICObjLFbfVJZOeycy8yWxBCCBdxMy+1tZW2z1+wHh2T76SHZUavMx9XTLTUIFkVnu9s1m5VYfMC+2rJoFM9VB1MUvPtLW1obu7u6DHyg2WACAisPqciAiSydwiasYyvf1UgCtvGsPsdWmZMwXvvPUXI7elZiI7Q1Wk7KkeEZkqIq+KSFf6/4tMzusWkXdEZK+IMJKTo+xsQWhXOBweqXEvImhra8OqVavQmLsqKYvZTlhGrAYvnUpXZeSbhlrMil2qDqXm+L8K4B9V9VIA/5i+beZGVb02318iokLl24KwUOFwGN3d3Ugmk+ju7sZ3v/tdfG/TtzF16vieefaOV/lYzSYzmk0UiURKCv6FvC6soRMspQb+2wH8MP31DwHcUeLjVRTf7NUhGo2O65EXEpDtuOu2Vry/7zFs/v5Xx1wNdHZ22h5HSPX2jTdUMVod3N/fj46OjqLbXMjrwt24gqXUwD9LVX8LAOn/Z5qcpwC2i8geEYlYPaCIRERkt4jsPn78eFGNshvQ+WavDkbpmUICcj7Zi4qWL/sE3n9v/8jVQCHPkRq8NN5Qxcl0VYbR6/K9Td/GbTclxnw2uGgqePIO7orIDgCfMPhWB4AfquqUrHP/Q1XH5flFpElVj4rITACvAvgzVd2Zr3HFDu6OLKHP2YAiWyUH2cjfnJr7bvWec3KA2orRZ8Ptuf3kLEcGd1V1sapeafBvK4APRWR2+slmAzhm8hhH0/8fA/BjAIsK/WHsstt74QpBsqOQVd75WL3nKpGuMvpsOPnzkX+Umur5KYA/TX/9pwC25p4gIheIyKTM1wBuBpBbKcoxdgI63+xkl1OLis4NfojB/rdh9p4rd7oKsNrTNxc7QtWu1MC/HsBNItIF4Kb0bYhIk4hsS58zC8A/icjbAHYB+Jmqvlzi8xqyG9D5Zie7nFrlnTj5c4OjY99zubOJnAz6Zp+NoYFD4Cr24Clp5a6qngDwxwbHjwK4Jf31+wCMtrBynFVAz85ZcoUg2ZVvjrud2vHJ4T4kh08YfKdy7zmzz0Zd/SV8zwdQVZVssBvQzT7MmcE3bgBBdtkpv5AKurVwc/DUyc5OJTdKGRoaQjwex8DAQFmfx4/q6+vR3NyMurq6gu9bVYG/1BWI3ACCCmFnMx+vbOTj5OrcSn5O4vE4Jk2ahFAoBBEp63P5iarixIkTiMfjmDt3bsH3D2R1TiOcy0yFsjORoNrGkyr9ORkYGMC0adMY9HOICKZNm1b0lRADfxqnd5IdmXTgucFjtiYSVFsJcDc+Jwz6xkp5Xaoq1VMsr1yOk/dl0hyJk9sMvjt+IoETKZZKbz5u9nz8nFQP9vhRfZfjVDg7ZT6y0xypWTqV6clXurSI2fMF8XNy8uRJfPe73y3qvt/85jfH1V/yCgZ+VN/lOBXOTnAdG/hqMaHxGsOdpJzkdE49FgPa2pKoqVG0tSWRW/zT6vn88jlxsvhitQZ+pnrAHbmCzsuzc4xy6sXOpInFgEgE6O9P9fd6ewXpjcqQWStm9Xx++Zw4Oevoq1/9Kt577z1ce+21uOmm
mzBz5ky88MILOHv2LO6880489thj+N3vfoe7774b8Xgcw8PDWLNmDT788EMcPXoUN954I6ZPn47XXnvNoZ/OGezxU+B5dXZOMaVFrHq7HR1Abge0vz91vNjn8xqnr5DWr1+Piy++GHv37sVNN92Erq4u7Nq1C3v37sWePXuwc+dOvPzyy2hqasLbb7+Nffv2YcmSJXj44YfR1NSE1157zXNBH2Dgp4CzG+zcSHMU88fGKmVlVuE5c7wacvjlnHW0fft2bN++HfPnz8cnP/lJvPvuu+jq6sJVV12FHTt24JFHHsEbb7yBCy+80LHnLBemeijQ7Jb5cCPNUehq23wpq5aWJHp7x/f1WlqSAGp8X8qk3Ok4VUV7eztWrhz/XtizZw+2bduG9vZ23HzzzVi7dm3Jz1dODPwUaE4HOyenXhb6xybfeMCja/bjwYd/H4nE6BL/hoYhPLrmXQBX+SaHb8buH/FCTJo0CWfOnAEAfOYzn8GaNWsQDocxceJEHDlyBHV1dTh37hymTp2K+++/HxMnTsQPfvCDMfedPn16kT9R+TDwU9WKxWLo6OhAb28vWltbEY1Gx1W8tAp2xQRxt8p+2Ont3nXrLzF8thePr7sB8aOT0dx0Gmvb/wl33RoHcFXF2lou5bhimTZtGq6//npceeWVWLp0Ke677z5cd911AICJEyfiueeew6FDh7B69WrU1NSgrq4OTz31FAAgEolg6dKlmD17tufy/Hl34HJTsTtwEWU2L8+eTtfY2FhQjXs7O7llc3NXt7G7aGX4fzetgwcP4vLLL3e7GZ5l9Po4sgMXkR9ZbV5e6GItu7ND3Cz74Zc59uQNTPVQVbLavDw7HXP+xE8bpnMKnT/vdjkDv+fnqbLY46eyc3IlpV2tra2Gx1tamsf05AfOvDFu+mMx89mrYSokBQcDP5VdpWvNAOablz+6Jju/n8RQ4gCAsemcYoI4Uy3kJ0z1UFnZKYdQDpkB3OxZPV9/Yi1uXXwaowE6mXWP0XROMbNDmGohP2Hgp7JystZMocLh8JgZPKMzX4yM5uQZxKnaMdVDZeO12i/GPfls1ZeTd2N8pZqUUp3zlltuwcmTJ4t+7nXr1uGSSy7BZZddhldeeaXoxzHCwE9l47UBz8mzVo6UT5aaiQZnVF9OPjO+MnBmZzD+AMRiQCgE1NSk/s+tO10gq8A/PGzViQC2bduGKVOmFPW8Bw4cwJYtW7B//368/PLL+OIXv5j3+QrBVA+VjZdrvwQhnZM9vjKUOIhKp9oqbrTudOp2Tw/G1Z0uUG5Z5mXLluGxxx7D7NmzsXfvXhw4cAB33HEHDh8+jIGBAXzpS19CJP2coVAIu3fvRl9fH5YuXYobbrgBv/zlLzFnzhxs3boVDQ0Nps+7detW3HPPPTj//PMxd+5cXHLJJdi1a9fIquFSMfBT2VRDcK30todOyh1fASo7wF5xVnWniwz869evx759+7B3714AwOuvv45du3Zh3759mDt3LgBg8+bNmDp1KhKJBD71qU/hrrvuwrRp08Y8TldXF55//nl8//vfx913340XX3wR999/PzZt2gQAWLVq1Zjzjxw5gk9/+tMjt5ubm3HkyJGifgYjDPxEFtyqvVOq8eMrI9/x3c9iW7660w5ZtGjRSNAHgI0bN+LHP/4xAODw4cPo6uoaF/jnzp2La6+9FgCwYMECdHd3Axgf8DOMSuk4uek8c/xEJpze1KOSjMdXACDpu5/FNpNFe6bHi3TBBaNXS6+//jp27NiBX/3qV3j77bcxf/58DAwMjLvP+eefP/J1bW0tzp07Z/kczc3NOHz48MjteDyOpqYmB1qfwsBPZMLN2julsp7B5K+fxbZoFMhZtIfGxtTxImWXZTZy6tQpXHTRRWhsbMS7776LN980+4NbmNtuuw1btmzB2bNn8cEHH6CrqwuLFi1y5LEBBn4iQ16bilqozAymoMxeApDK43d2Am1tgEjq/87OovP7wNiyzKtX
rx73/SVLluDcuXO4+uqrsWbNmjF5eTs2bdo0kufPNm/ePNx999244oorsGTJEjz55JOora0t+ufIVVJZZhFZAeBRAJcDWKSqhjWURWQJgG8BqAXwtKqut/P4bpVl9vOAHjmjWssc+w3LMltzqyzzPgDLAew0O0FEagE8CWApgCsA3CsiV5T4vGXlRm0Z8hbW3qFqVtKsHlU9COQdbV4E4JCqvp8+dwuA2wEcKOW5y8Wt2jLkLdUwFZXITCVy/HMAHM66HU8fMyQiERHZLSK7jx8/XvbG5fLzgB4RkR15A7+I7BCRfQb/brf5HEaXA6YDC6raqaoLVXXhjBkzbD6FM/w+oEdEZEfeVI+qljqSFQfQknW7GcDREh+zLKxqy3BAj4iqRSVSPW8BuFRE5orIBAD3APhpBZ63YBzQI6IgKCnwi8idIhIHcB2An4nIK+njTSKyDQBU9RyAhwC8AuAggBdUdX9pzS6P7OqN2f840Oe8WCyGUCiEmpoahEIhxEqsokhUDm6VZT5x4gRuvPFGTJw4EQ899FBRj2GlpMCvqj9W1WZVPV9VZ6nqZ9LHj6rqLVnnbVPV31PVi1W1+GV0VBVisRgikQh6enqgqujp6cEDD3wezz272e2mkc853aFwqyxzfX09nnjiCXzjG98o6v75cOUuVVxHRwf6c6ooJhKD+FpHu0stompg1KGIRCIlBf/sssyrV6/G66+/jhtvvBH33XcfrrrqKgDAHXfcgQULFmDevHno7OwcuW8oFMJHH32E7u5uXH755XjggQcwb9483HzzzUgkEpbPe8EFF+CGG25AfX190W23pKqe/bdgwQKl6iMiitTMrjH/RKDD5/rcbh55yIEDB2yf29bWZvi+amtrK/r5P/jgA503b97I7ddee00bGxv1/fffHzl24sQJVVXt7+/XefPm6UcffTTSnuPHj+sHH3ygtbW1+utf/1pVVVesWKHPPvusqqo+9dRT+tRTT5k+/zPPPKMPPvig6feNXh8AuzVPbGVZZqq41tZW9PT0jDve3HQRZ1BR0XpNyi+bHS9WJcoylxtTPVRx0WgUjTlVFBsa6rC2fQnXTVDRWk3KL5sdL1YlyjKXGwM/VVw4HEZnZydaWmZCBGiZMwUbN3wWK5YvAFdLU7GMOhSNjY2I+rAsc7kx1UOuCIfDuHVxHzTZl/Mdb+zJS/4TTpdf7ujoQG9vL1pbWxGNRkeOFyO7LPPSpUuxbNmyMd9fsmQJNm3ahKuvvhqXXXZZUWWZAeOUTygUwunTpzE4OIif/OQn2L59O664wpn6liWVZS43t8oyE5E3sCyzNbfKMhMRkc8w8BMRBQwDPxFRwDDwExEFDAM/EVHAMPATEQUMAz9RGSSH+3Dm2NNchexzbpVlfvXVV7FgwQJcddVVWLBgAX7xi18U9ThmGPiJyuBs35tIDp/iKuQKi8WAUAioqUn9X+o2D26VZZ4+fTr+/u//Hu+88w5++MMf4nOf+1xRj2OGgZ/IYaN7N4O1hyooFgMiEaCnB1BN/R+JlBb83SrLPH/+fDQ1NQEA5s2bh4GBAZw9e7b4HyQHAz+Rw8bu3czaQ5XS0QHkbPOA/v7U8WKtX78eF198Mfbu3YsNGzYAAHbt2oVoNIoDBw4AADZv3ow9e/Zg9+7d2LhxI06cODHucbq6uvDggw9i//79mDJlCl588UUAqZINmbINZl588UXMnz9/TKG3UrFWD5GDRnv7mTTAMAb79+P8idehpvYCq7tSicyqLztclbmiZZn379+PRx55BNu3b3fuBwB7/OQTfhksHdvbz2CvvxLMqi87XJW5YmWZ4/E47rzzTvzoRz/CxRdf7Ezj0xj4yRf8Mlg6NPAeRnv7GcPp41RO0SiQU5UZjY2p48VyqyzzyZMnsWzZMqxbtw7XX3+9I4+Zjake8rzcwVIvp00mz1rpdhMCK1N9uaMjld5pbU0F/RKqMrtWlvk73/kODh06hCeeeAJPPPEEAGD79u2Y
OXNm8T9MFpZlJs9LnNqRlTevxYTGK1mvPyBYltkayzJTVTIbLPV6rp/Iyxj4ydM4WErkPAZ+8jQOlpKX09FuKuV14eAueRoHS4Otvr4eJ06cwLRp0yAibjfHM1QVJ06cQH19fVH3Z+AnIs9qbm5GPB7H8ePH3W6K59TX16O5ubmo+zLwE5Fn1dXVjVklS85gjp+IKGAY+ImIAoaBn4goYDy9cldEjgPocenppwP4yKXnLhTbWh5+aivgr/ayreUxHcAFqjrD6iRPB343icjufMuevYJtLQ8/tRXwV3vZ1vKw21ameoiIAoaBn4goYBj4zXXmP8Uz2Nby8FNbAX+1l20tD1ttZY6fiChg2OMnIgoYBn4iooBh4DchIk+IyG9EZK+IbBeRJrfbZEVENojIu+k2/1hEprjdJjMiskJE9otIUkQ8OU1ORJaIyL+JyCER+arb7bEiIptF5JiI7HO7LVZEpEVEXhORg+nf/5fcbpMVEakXkV0i8na6vY+53aZ8RKRWRH4tIv9gdR4Dv7kNqnq1ql4L4B8ArHW5Pfm8CuBKVb0awL8DaHe5PVb2AVgOYKfbDTEiIrUAngSwFMAVAO4VkSvcbZWlHwBY4nYjbDgH4MuqejmATwN40OOv61kAf6Sq1wC4FsASESlsU93K+xKAg/lOYuA3oaqns25eAMDTo+Cqul1Vz6VvvgmguHqtFaCqB1X139xuh4VFAA6p6vuqOghgC4DbXW6TKVXdCeBjt9uRj6r+VlX/Nf31GaQC1Bx3W2VOU/rSN+vS/zwbB0SkGcAyAE/nO5eB34KIREXkMIAwvN/jz/Z5AD93uxE+NgfA4azbcXg4QPmRiIQAzAfwLy43xVI6dbIXwDEAr6qql9v7TQBfAZDMd2KgA7+I7BCRfQb/bgcAVe1Q1RYAMQAPudva/O1Nn9OB1CV1zL2W2murhxlt9eTZnp7fiMhEAC8C+B85V9aeo6rD6XRvM4BFInKly00yJCJ/AuCYqu6xc36gN2JR1cU2T/0bAD8D8JdlbE5e+dorIn8K4E8A/LG6vECjgNfWi+IAWrJuNwM46lJbqoqI1CEV9GOq+pLb7bFLVU+KyOtIjaV4cRD9egC3icgtAOoBTBaR51T1fqOTA93jtyIil2bdvA3Au261xQ4RWQLgEQC3qWq/2+3xubcAXCoic0VkAoB7APzU5Tb5nqQ2zf1rAAdV9f+63Z58RGRGZnaciDQAWAyPxgFVbVfVZlUNIfV+/YVZ0AcY+K2sT6cmfgPgZqRGy73sOwAmAXg1PQV1k9sNMiMid4pIHMB1AH4mIq+43aZs6UHyhwC8gtQA5Auqut/dVpkTkecB/ArAZSISF5H/7nabTFwP4HMA/ij9Ht2b7qF61WwAr6VjwFtI5fgtp0n6BUs2EBEFDHv8REQBw8BPRBQwDPxERAHDwE9EFDAM/EREAcPAT0QUMAz8REQB8/8B/WRGok1zZwcAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "from sklearn.decomposition import PCA\n",
    "\n",
    "pca_model = PCA()\n",
    "transformed_data_2D = pca_model.fit_transform(X_train)\n",
    "\n",
    "unlabeled_data = pca_model.transform(X_test)\n",
    "plt.scatter(unlabeled_data[:,0], unlabeled_data[:,1],\n",
    "            color='khaki', marker='^', label='test')\n",
    "\n",
    "for label in range(3):\n",
    "    data_subset = transformed_data_2D[y_train == label]\n",
    "    plt.scatter(data_subset[:,0], data_subset[:,1],\n",
    "            color=['r', 'k', 'b'][label], label=f'train: {label}')\n",
    "\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Many unlabeled points cluster around _Species 0_. These unlabeled flowers clearly belong to that species. Elsewhere in the plot, certain unlabeled flowers are close to both _Species 1_ and _Species 2_. For each such point, we'll need to quantify which labeled species is closer. This requires us to compute the Euclidean distance between each feature vector in `X_test` and each feature vector in `X_train`.\n",
    "\n",
    "**Listing 20. 5. Computing Euclidean distances between points**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our first test set feature is [5.8 2.8 5.1 2.4]\n",
      "Our first training set feature is [5.1 3.5 1.4 0.2]\n",
      "The Euclidean distance between the features is 4.18\n"
     ]
    }
   ],
   "source": [
    "from sklearn.metrics.pairwise import euclidean_distances\n",
    "distance_matrix = euclidean_distances(X_test, X_train)\n",
    "\n",
    "# distance_matrix[i][j] is the distance between X_test[i] and X_train[j]\n",
    "f_test, f_train = X_test[0], X_train[0]\n",
    "distance = distance_matrix[0][0]\n",
    "print(f\"Our first test set feature is {f_test}\")\n",
    "print(f\"Our first training set feature is {f_train}\")\n",
    "print(f\"The Euclidean distance between the features is {distance:.2f}\")"
   ]
  },
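  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `euclidean_distances` function applies the standard formula: the square root of the summed squared differences between coordinates. The sketch below checks this by hand for two rows of the iris dataset. The names `iris_rows`, `p`, and `q` are illustrative, and the two rows were chosen arbitrarily."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.metrics.pairwise import euclidean_distances\n",
    "\n",
    "iris_rows = load_iris().data\n",
    "p, q = iris_rows[0], iris_rows[50]  # two arbitrarily chosen rows\n",
    "\n",
    "# Square root of the summed squared coordinate differences\n",
    "manual_distance = np.sqrt(np.sum((p - q) ** 2))\n",
    "# euclidean_distances expects 2D inputs, so each point is wrapped in a list\n",
    "library_distance = euclidean_distances([p], [q])[0][0]\n",
    "\n",
    "assert np.isclose(manual_distance, library_distance)\n",
    "print(f\"Both computations give a distance of {manual_distance:.2f}\")"
   ]
  },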
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can assume that each unlabeled point shares the class that is most common among its nearest labeled neighbors. This strategy forms the basis for the **K-nearest neighbors** algorithm, which is referred to as **KNN** for short.\n",
    "\n",
    "**Listing 20. 6. Labeling a point based on its nearest neighbors**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The 3 nearest neighbors of Point 10 have the following labels:\n",
      "[2 1 2]\n",
      "\n",
      "The most common class label is 2. It occurs 2 times.\n"
     ]
    }
   ],
   "source": [
    "from collections import Counter\n",
    "np.random.seed(6)\n",
    "random_index = np.random.randint(y_test.size)\n",
    "labeled_distances = distance_matrix[random_index]\n",
    "labeled_neighbors = np.argsort(labeled_distances)[:3]\n",
    "labels = y_train[labeled_neighbors]\n",
    "\n",
    "top_label, count = Counter(labels).most_common()[0]\n",
    "print(f\"The 3 nearest neighbors of Point {random_index} have the \"\n",
    "      f\"following labels:\\n{labels}\")\n",
    "print(f\"\\nThe most common class label is {top_label}. It occurs {count} \"\n",
    "       \"times.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The most common class label among the neighbors of _Point 10_ is _Label 2_. How does this compare to the actual class of the flower?\n",
    "\n",
    "**Listing 20. 7. Checking the true class of a predicted label**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The true class of Point 10 is 2.\n"
     ]
    }
   ],
   "source": [
    "true_label = y_test[random_index]\n",
    "print(f\"The true class of Point {random_index} is {true_label}.\")"
   ]
  },
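  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A single correct prediction says little about overall performance. As a rough sketch, the same majority vote can be repeated for every test point and the predictions compared with the true labels. The helper `predict_label` is a hypothetical name introduced here, and the cell builds its own split (`X_tr`, `X_te`, `y_tr`, `y_te`, with an arbitrary `random_state`) so that it neither depends on nor modifies earlier variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "import numpy as np\n",
    "from sklearn.datasets import load_iris\n",
    "from sklearn.metrics.pairwise import euclidean_distances\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "def predict_label(distances, train_labels, k=3):\n",
    "    \"\"\"Return the most common label among the k nearest training points.\n",
    "\n",
    "    Hypothetical helper: `distances` is one row of a test-by-train distance\n",
    "    matrix. Ties go to the label that appears first among the neighbors.\n",
    "    \"\"\"\n",
    "    nearest = np.argsort(distances)[:k]\n",
    "    return Counter(train_labels[nearest]).most_common(1)[0][0]\n",
    "\n",
    "features, targets = load_iris(return_X_y=True)\n",
    "# Illustrative split; random_state=0 is an arbitrary choice\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(features, targets,\n",
    "                                          train_size=0.25, random_state=0)\n",
    "all_distances = euclidean_distances(X_te, X_tr)\n",
    "predictions = np.array([predict_label(row, y_tr) for row in all_distances])\n",
    "accuracy = np.mean(predictions == y_te)\n",
    "print(f\"Fraction of test points labeled correctly with K=3: {accuracy:.2f}\")"
   ]
  },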
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can reformulate KNN as a graph theory problem by treating each point as a node and its label as a node attribute. We then choose an unlabeled point and extend edges to its K closest labeled neighbors. Visualizing the resulting neighbor graph lets us identify the class of the point.\n",
    "\n",
    "**Listing 20. 8. Visualizing nearest neighbors with NetworkX**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAb4AAAEuCAYAAADx63eqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAggUlEQVR4nO3deXiU1cH+8XtmsifsIQSL4lq1/rQvFkUNgbCZBcJODArIvqi4oEVL61uKC60KKqVaERBEIbIIJGRhz0JaqdaiiAoiKqhA/CGCLIHMzPP+EUvLmm0mZ5bv57pyXZp5ZnKDXNw+55znHJtlWZYAAAgSdtMBAACoTxQfACCoUHwAgKBC8QEAggrFBwAIKhQfACCoUHwAgKBC8QEAggrFBwAIKiGmAwAAgtf3knZJOiwpRlJrSS28/DMpPgBAvbIk/V3Sc5LyJEX812vlkhIkTZTUTd4ZlrSxVycAoL7sl5QqaYekY6oswXOJkdRM0lpJV3k4A8UHAKgXX0u6SdL/l+SsxvU2SQ0kbZJ0vQdzUHwAAK87KumXkr6U5Krhe5tJ2iqppYeysKoTAOB1r0naq3OU3okT0ogRUuvWUoMGUps2Un7+aZcclvQnD2ah+AAAXmVJelaVc3pncTqliy+WioqkQ4ekJ56QMjKkL788dUmFpDmSjnsoD8UHAPCqYkkHzvdidLQ0ebJ06aWS3S716CFddpn0z3+edeliD+Wh+AAAXrVelXN81bJ/v7Rjh3Tddad9+4ikbA/lofgAAF61r7oXVlRId90l3X23dM01Z738nYfyUHwAAK+q1k4pbrc0eLAUFibNnFn7z/FUHgAAautnqiyb8z67Z1mVKzv375fy8qTQ0HNedpGH8nDHBwDwqr6Szl1lPxk3TvrkEyknR4qMPOclMZIGeygPD7ADALzuZknvnuuFr76qXNEZHi6F/Ncg5CuvVM73/SRe0jfyzN0aQ50AAK97VNLdOsfqztatK4c6LyBS0gR5boiSOz4AgNe5JXU4dEilYWHnHc48l1BJv1DlaQ7Vf9eFMccHAPC6v23apO3XX69rjx5VVDXfEyHpCknr5LnSkyg+AICXrVmzRn369NGbs2frg9hYjVBlkUWf5/pIVZZeL1XOC8Z6OA9DnQAAr1m+fLnGjBmjt99+W+3btz/1/cOSXpc0TdJuVR5BZElqLmm8pFGS4ryUieIDAHjFggULNHHiROXm5urGG28873WWKjegjlD9DEOyqhMA4HEvvfSSpk6dqg0bNujaa6+94LU2qdrzfp5A8QEAPGrq1KmaPXu2iouLddlll5mOcxaKDwDgEZZladKkScrOzlZJSYkuushTm4x5FsUHAKgzt9ut8ePHa/PmzSoqKlJsrKfXYnoOxQcAqBOn06nhw4friy++0Pr169WoUSPTkS6I4gMA1NqJEyc0cOBAHTt2TKtXr1ZUVH0uU6kdHmAHANTK0aNH1bNnT9ntdq1cudIvSk+i+AAAtfDDDz8oOTlZF110kbKyshQeHm46UrVRfACAGvnuu+/UuXNn3XjjjZozZ45CQvxr1oziAwBU2zfffKMOHTooLS1NL774oux2/6sR/0sMADBi165dSkxM1LBhw/Tkk0/KZrOZjlQrFB8AoEoff/yxOnbsqF//+teaOHGi6Th14l8DswCAevfPf/5T3bt313PPPadBgwaZjlNnFB8A4LxKSkrUr18/vfrqq+rVq5fpOB5B8QEAzqmgoECDBw/WokWL1LVrV9NxPIY5PgDAWZYtW6YhQ4Zo5cqVAVV6EsUHADjD/Pnzdd9992n16tW67bbbTMfxOIY6AQCnzJw5U88884w2btyoa665xnQcr6D4AACyLEtTp07V3LlzVVxcrEsvvdR0JK+h+AAgyFmWpccee0y5ubkqKSlRy5YtTUfyKooPAIKY2+3Wvffeq/fee09FRUVq1qyZ6UheR/EBQJByOp0aNmyYdu/erfXr16th
w4amI9ULig8AgtCJEyeUmZmp8vJy5efn+81Zep7A4wwAEGSOHj2q9PR0hYSE+NUBsp5C8QFAEPnhhx90++23q1WrVlq0aJHCwsJMR6p3FB8ABImysjJ16tRJbdu21ezZs/3uAFlPofgAIAh8/fXX6tChg9LT0/XCCy/45QGynhK8v3IACBKff/65EhMTNWLECE2ZMsVvD5D1lOC8zwWAILFt2zYlJyfr8ccf15gxY0zH8QkUHwAEqPfee089evTQ9OnTdeedd5qO4zMoPgAIQMXFxerfv79mz56tnj17mo7jUyg+AAgw+fn5GjJkiLKystSlSxfTcXwOi1sAIIAsWbJEQ4cOVXZ2NqV3HhQfAASI1157TQ888IBWr16tW2+91XQcn8VQJwAEgBkzZui5557Txo0bdfXVV5uO49MoPgDwY5Zl6emnn9a8efNUUlKi1q1bm47k8yg+APBTlmXp0UcfVX5+voqLiwP+AFlPofgAwA+5XC7de++9ev/991VUVKSmTZuajuQ3KD4A8DMVFRUaOnSovvnmG61fv14NGjQwHcmvUHwA4EfKy8t1xx13yOl0Kj8/X5GRkaYj+R0eZwAAP3HkyBH16NFD4eHhWr58OaVXSxQfAPiBfx8g27p166A9QNZTKD4A8HFlZWVKSkpSu3bt9Oqrr8rhcJiO5NcoPgDwYXv27FFiYqJ69+6t6dOnB/UBsp7C7yAA+KidO3cqMTFRo0eP1uTJk4P+AFlPofgAwAd99NFH6tixoyZNmqSHH37YdJyAwuMMAOBj3n33XaWnp+v555/XwIEDTccJOBQfAPiQoqIiDRgwQHPmzFF6errpOAGJoU4A8BF5eXkaMGCAsrKyKD0vovgAwAcsXrxYw4YNU3Z2tjp37mw6TkCj+ADAsLlz5+rBBx/U2rVrdcstt5iOE/CY4wMAg1544QU9//zzKiws1M9//nPTcYICxQcABliWpSeffFILFixQSUmJLrnkEtORggbFBwD1zLIsTZw4UatXr1ZxcbHi4+NNRwoqFB8A1COXy6V77rlHW7ZsUWFhIQfIGkDxAUA9qaio0JAhQ7R//36tW7eOA2QNofgAoB6Ul5crIyNDbrdbubm5nKVnEI8zAICXHTlyRN27d1dUVBQHyPoAig8AvOjgwYPq1q2bLr/8cr355psKDQ01HSnoUXwA4CX79+9XUlKSbr31Vs2aNYsDZH0ExQcAXrB792516NBBffv21bRp0zhLz4ewuAUAPOyzzz5Tt27d9MADD+ihhx4yHQdnoPgAwIM+/PBDpaam6g9/+INGjhxpOg7OgeIDAA/ZvHmzevbsqRkzZuiOO+4wHQfnQfEBgAcUFhYqIyNDr732mrp37246Di6AxS0AUEe5ubnKyMjQW2+9Ren5AYoPAOrgrbfe0ogRI5STk6NOnTqZjoNqoPgAoJZmz56tCRMmaO3atWrXrp3pOKgm5vgAoBaef/55vfjiiyosLNRVV11lOg5qgOIDgBqwLEtTpkzRm2++qeLiYg6Q9UMUHwBUk2VZeuSRR7Ru3TqVlJSoRYsWpiOhFig+AKgGl8ulsWPHauvWrSosLFSTJk1MR0ItUXwAUIWKigoNHjxY3333ndauXcsBsn6O4gOACzh+/LgGDBggu92u3NxcRUREmI6EOuJxBgA4jx9//FFpaWlq2LChli1bRukFCIoPAM7h+++/V9euXXXVVVdpwYIFHCAbQCg+ADjDvn37lJSUpMTERL3yyiscIBtgKD4A+C//PkB2wIABevbZZzlANgBRfADwkx07digxMVH33nuvHn/8cUovQLGqEwAkffDBB0pNTdWTTz6p4cOHm44DL6L4AAS9d955R7169dKf//xnZWRkmI4DL6P4AAS1DRs2KDMzU/PmzVNaWprpOKgHzPEBCFo5OTnKzMzUkiVLKL0gQvEBCEpZWVkaNWqUVq1apY4dO5qOg3pE8QEIOrNmzdLDDz+sdevW6eabbzYdB/WMOT4A
QWXatGmaOXOmioqKdOWVV5qOAwMoPgBBwbIsTZ48WVlZWSouLtbFF19sOhIMofgABDzLsjRhwgRt3LhRJSUliouLMx0JBlF8AAKay+XSmDFj9PHHH2vjxo0cIAuKD0DgOnnypAYPHqwDBw5ozZo1iomJMR0JPoDiAxCQjh8/rv79+yskJESrVq3iLD2cwuMMAALO4cOHlZqaqsaNG2vp0qWUHk5D8QEIKAcOHFDXrl11zTXXcIAszoniAxAw9u7dq6SkJCUlJenll1+W3c5fcTgbfyoABISvvvpKHTp0UGZmpv70pz9xlh7Oi+ID4Pe2b9+uxMREjR8/Xr/97W8pPVwQqzoB+LUtW7YoLS1NTz31lIYNG2Y6DvwAxQfAb/39739X79699Ze//EX9+/c3HQd+guID4JfWr1+vzMxMvf7660pNTTUdB36EOT4Afic7O1sDBw7UsmXLKD3UGMUHwK8sXLhQo0ePVl5enjp06GA6DvwQxQfAb8yaNUsTJ07UunXr1LZtW9Nx4KeY4wPgF5599lm9/PLLKioq0hVXXGE6DvwYxQfAp1mWpf/93//V0qVLVVxcrFatWpmOBD/n1eI7LOl1SbmSvpcUKqmVpKGSbhfjrAAuzO1266GHHlJxcbGKioo4QBYeYbMsy/L0h34h6QlJWZJsko6d8XrMT18TJN0vKdzTAQD4PZfLpVGjRmn79u3Kzc1V48aNTUdCgPB48f1NUqqko5JcVVwbKek6SWskcSYygH87efKkBg0apIMHD2rFihWKjo42HQkBxKOjjf9S5RDmYVVdepJ0XNKHkpJ++mcAOHbsmHr37q2TJ08qJyeH0oPHeaz4TkpKVuWd3mlOnJBGjJBat5YaNJDatJHy80973w5J4z0VBIDf+vcBss2aNdOSJUs4QBZe4bHiW67z3LU5ndLFF0tFRdKhQ9ITT0gZGdKXX566pFzSQkmHPBUGgE/YsmWLRo4cqdtuu0033HCD2rdvr/Hjx2v79u1nXXvgwAF16dJF1113nebPn88BsvAaj83x3ajKoc5queEG6fe/l/r1O/WtKEl/FHd+QCBYsmSJpkyZol27dunEiRNyuf4z+RESEqKQkBDdcMMNmjJlipKTk7V3715169ZNPXr00NSpUzlWCF7lkeLbKekGVXOebv/+ymHPLVuka6457aXLJX1e1zAAjHG73Ro/frzmz5+vo0fPmvg4S1RUlMaOHasVK1Zo5MiR+s1vflMPKRHsPFJ86yT1VzWGKisqpNRU6YorpFdeOevlCLHIBfBn48eP19y5c3Xs2JkPMZ2fzWZTenq6Vq5c6cVkwH94pPhWSLpblas5z8vtlu68Uzp8WFq5UjrX+L3LpfhWrRQaGlqtr5CQkGpfW9uvmvwMh8NR199KwG/l5OQoMzOzRqX3b1FRUdqwYYPatWvnhWTA6TxSfBsl9dEF7vgsSxo+vHJBS16eFBl5zssiLUs79+5VRUVFtb+cTmeNrq/tV3V+jqR6Kdj6+hnMs6Am2rVrp3/84x+1eq/NZlOfPn20bNkyD6cCzuaR4vtWlfNzJ853wdixlXN669ZJMTHn/ZxfStpS1zAGuVwur5drfZW40+mUw+EImBJ3OBwUuRd9+umnatOmjcrLy2v9GREREdq9e7eaN2/uwWTA2TyyV+dFkhJVOdd3lq++qpzPCw+X4uP/8/1XXpHuuuvUv8ZImuiJMAY5HA45HI6AePbIsqw6FXFN3nv06FGv/xy32x0wJR4aGiq73bd2up09e7acTmedP2fhwoV64IEHPJAIOD+PbVI9UdI7ko6c+ULr1pVDnVWwSepX5VWoLzab7dRfsoHA7XbX2534sWPHvP5z7Ha7TxV4cXFxnYuvvLxcn332mYf+iwPn57Hi66LKkxd2SqrpH/8oSQ+LzarhPXa7XeHh4QoP9/8/ZZZl1XpYvaYlfvz4cR0+fLjK63bu3OmRX9uhQ2xjAe/zWPHZVTnU+T+SDqp6e3VKlaXX
TdLjngoCBDibzXbqIfDI8ywUq299+vTRihUr6vw5sbGxdQ8DVMGjEwU/k/SeKu/8oqpxfbSkvpKWejoIgHp1880313luOyYmRm3atPFQIuD8vHIe3xFJ8yU9K+mAKs/jc//0Wrgq5/NuU+W84O0//TsA/1VWVqZLLrlEJ06cd213lWJiYlRWVuYzd7EIXF45gT1G0r2S7pFUIqlI0n5Vnr/XUpWLWFp74wcDMCIuLk5paWlasWKFavP/0qGhoRo+fDilh3rhlTs+AMHnX//6lxISEnT8eM03HoyOjtZHH32kSy+91PPBgDMwtQbAI9q0aaNZs2bV+K7N4XBo6dKllB7qDcUHwGMGDRqkefPmKSoqqspnQMPDwxUdHa2rr75a69adc/sLwCsoPgAelZGRoQ8//FDjxo1TTEyMYmJiTm0XZ7fb1aBBAzVu3FgPP/ywtm/frpKSEuXn52vatGmGkyNYMMcHwGuOHz+u5cuXa9euXfrhhx/UpEkTXXvttUpPTz/tjnDPnj1KSEjQ008/rUGDBhlMjGBA8QHwCdu2bVPnzp31+uuvKzk52XQcBDCKD4DPKC0tVe/evZWXl6ebbrrJdBwEKOb4APiMhIQEzZkzRz179tSOHTtMx0GA8soD7ABQWz179lRZWZlSUlJUWlqqli1bmo6EAEPxAfA5I0eO1L59+5SamqqioiI1atTIdCQEEOb4APgky7I0fvx4bdu2TQUFBQFxpBR8A8UHwGe5XC5lZmbKZrNp0aJFcjgcpiMhALC4BYDPcjgcWrBggcrKyvTggw/WagNs4EwUHwCfFhERoZUrV6q4uFhTp041HQcBgMUtAHxeo0aNlJ+fr4SEBMXHx2v48OGmI8GPUXwA/MJFF12kgoICdezYUXFxcerRo4fpSPBTLG4B4Fc2b96sHj16KDs7W7feeqvpOPBDzPEB8Cvt2rXT66+/rj59+uiTTz4xHQd+iOID4HdSU1P1zDPPKCUlRV9//bXpOPAzzPEB8EtDhgzRvn37lJKSopKSEjVp0sR0JPgJ5vgA+C3LsjRhwgS99957WrNmjSIjI01Hgh+g+AD4NbfbrUGDBunYsWNaunSpQkIYyMKFMccHwK/Z7XbNmzdPx44d0z333MPuLqgSxQfA74WFhWnZsmV6//33NXnyZNNx4OMYEwAQEBo0aKC8vDwlJCSoZcuWGjt2rOlI8FEUH4CAERcXp4KCAiUmJiouLk59+/Y1HQk+iOIDEFCuuOIKrVq1SikpKWrWrJk6duxoOhJ8DHN8AALOjTfeqIULF2rAgAHaunWr6TjwMRQfgIDUtWtXzZgxQ2lpafrqq69Mx4EPYagTQMDKzMzU/v37lZycrE2bNik2NtZ0JPgAHmAHEPAee+wxFRYWav369YqOjjYdB4ZRfAACnmVZGjZsmL777jutWLFCoaGhpiPBIOb4AAQ8m82mV199VZI0atQodncJchQfgKAQGhqqxYsX69NPP9WkSZNMx4FBFB+AoBEdHa1Vq1Zp+fLlmjFjhuk4MIRVnQCCSmxsrFavXq327dsrLi5OmZmZpiOhnlF8AIJO69atlZeXpy5duig2NlZdu3Y1HQn1iKFOAEHp+uuv19KlS3XnnXfq/fffNx0H9YjiAxC0OnTooL/+9a/q0aOHPv/8c9NxUE8Y6gQQ1Pr27auysjIlJyertLRULVq0MB0JXkbxAQh6Y8eO1d69e5WWlqbCwkI1aNDAdCR4ETu3AIAqd3cZO3asdu3apdzcXIWFhZmOBC+h+ADgJ06nU/3791dUVJTeeOMN2e0sgwhE/FcFgJ+EhIRo0aJF2rNnjx555BG2NgtQFB8A/JfIyEhlZ2drzZo1eu6550zHgRewuAUAztCkSRMVFBQoISFB8fHxGjx4sOlI8CCKDwDOoVWrViooKFCnTp3UvHlzpaSkmI4ED2FxCwBcwN/+9jf16tVLubm5uvnmm03HgQcwxwcAF3Dbbbdp7ty56tWrl3bs2GE6
DjyA4gOAKqSnp+vpp59WcnKyvv32W9NxUEfM8QFANQwbNkx79+5VamqqiouL1ahRI9ORUEvM8QFANVmWpfvvv19bt25VQUGBIiIiTEdCLVB8AFADLpdLAwcOlNvt1ltvvSWHw2E6EmqIOT4AqAGHw6EFCxbo+++/1/3338/uLn6I4gOAGgoPD9fy5ctVWlqqp556ynQc1BCLWwCgFho1aqT8/PxTu7uMHDnSdCRUE8UHALXUsmVLFRQUqGPHjoqLi1PPnj1NR0I1sLgFAOro3XffVVpamlasWKGEhATTcVAF5vgAoI5uuukmvfHGG+rbt6+2bdtmOg6qQPEBgAckJydr2rRpSk1N1Z49e0zHwQUwxwcAHjJo0CDt27dPKSkpKikpUdOmTU1HwjkwxwcAHvbII4/onXfe0dq1axUZGWk6Ds5A8QGAh7ndbg0ZMkQ//vijli1bppAQBtd8CXN8AOBhdrtdc+fOVXl5ucaNG8fuLj6G4gMALwgLC9OyZcu0ZcsW/f73vzcdB/+F+28A8JKYmBjl5uaqffv2io+P1z333GM6EkTxAYBXxcXFqaCgQImJiYqLi1P//v1NRwp6FB8AeNnll1+uVatWKTk5WbGxsUpKSjIdKagxxwcA9aBNmzbKyspSRkaGPvjgA9NxghrFBwD1pHPnzpo5c6a6d++uL7/80nScoMVQJwDUo4yMDO3fv1/JycnatGmTmjdvbjpS0OEBdgAwYNKkSVq/fr02bNig6Oho03GCCsUHAAZYlqURI0Zo3759WrlypUJDQ01HChoUHwAY4nQ61bt3bzVr1kzz5s2TzWYzHSkosLgFAAwJCQnR4sWLtWPHDj322GOm4wQNig8ADIqKitKqVauUk5OjF154wXScoMCqTgAwrFmzZiooKFD79u3VokULDRw40HSkgEbxAYAPuOSSS5SXl6cuXbooNjZW3bp1Mx0pYLG4BQB8SElJifr166f8/Hz96le/Mh0nIDHHBwA+JDExUbNmzVJ6erp27txpOk5AYqgTAHxM7969VVZWpuTkZJWWlio+Pt50pIBC8QGADxo9erT27t2rtLQ0FRYWqmHDhqYjBQzm+ADAR1mWpXHjxmnnzp3Kzc1VeHi46UgBgeIDAB/mcrk0YMAAhYWFaeHChbLbWZpRV/wOAoAPczgcWrhwob799ltNmDBB3KvUHcUHAD4uIiJC2dnZWr9+vZ555hnTcfwei1sAwA80btxYBQUFSkhIUHx8vO6++27TkfwWxQcAfuJnP/uZCgoKlJSUpObNmystLc10JL/E4hYA8DPvvPOOevbsqZycHLVr1850HL/DHB8A+JlbbrlFr732mnr16qXt27ebjuN3KD4A8EPdu3fXH//4R6WkpOjbb781HcevMMcHAH5q6NCh2rdvn1JSUlRcXKzGjRubjuQXmOMDAD9mWZYefPBBbdmyRatXr1ZERITpSD6P4gMAP+d2u3XnnXeqoqJCixcvlsPhMB3JpzHHBwB+zm63a/78+Tp06JDuu+8+dnepAsUHAAEgPDxcb7/9tjZv3qwnnnjCdByfxuIWAAgQDRs2VF5e3qndXUaPHm06kk+i+AAggMTHx2v16tXq0KGD4uLi1Lt3b9ORfA7FBwAB5sorr1R2drbS0tIUGxur9u3bm47kU5jjA4AA1LZtW73xxhvq16+fPvroI9NxfArFBwAB6vbbb9f06dOVlpam3bt3m47jMxjqBIAAdtddd6msrEwpKSnatGmTmjZtajqScTzADgBBYOLEidq0aZPWrVunqKgo03GMovgAIAi43W4NHTpUBw8e1PLlyxUSErwDfszxAUAQsNvtmjNnjpxOp8aMGRPUu7tQfAAQJEJDQ7VkyRJt3bpVjz/+uOk4xgTvvS4ABKGYmBjl5uaqffv2io+P13333Wc6Ur2j+AAgyDRv3lwFBQVKTExUixYtNGDAANOR6hXFBwBB6LLLLlNubq66deum2NhYderUyXSkesMcHwAE
qV/+8pdavHix7rjjDm3ZssV0nHpD8QFAEEtKStJLL72k7t2764svvjAdp14w1AkAQa5///7av3+/kpOTVVpaqubNm5uO5FU8wA4AkCT97ne/05o1a7RhwwbFxMSYjuM1FB8AQJJkWZZGjRqlr7/+WtnZ2QoLCzMdySsoPgDAKU6nU3379lWjRo00f/582e2BtxQk8H5FAIBaCwkJUVZWlnbt2qVHH33UdByvoPgAAKeJiopSTk6O8vLyNH36dNNxPI5VnQCAszRt2lQFBQVKSEhQixYtdNddd5mO5DEUHwDgnC6++GLl5+erc+fOat68uW6//XbTkTyCxS0AgAsqLS1Vnz59lJeXp7Zt2572mst5UK6T38qyTkiyy+6IVkj4pbLZQs2ErQaKDwBQpezsbI0dO1ZFRUW68sor5DzxuU4ceVeuijLJZpcstyRb5T/LUljkLxQW/Ss5QpqYjn4Wig8AUC2zZ8/Wyy8/r3WrJshuOypZFRe42i7JrvAGtyk8uq1sNlt9xawSxQcAqBa3+7i++XymIsMthYY6qvcmW4jCom5UZMNE74arAR5nAABUybIsHT2wVA1jQqpfepJkOXXy6Ps6eewT74WrIYoPAFAl18ndcrsOSnKf9v1ZczcpKeUFxV36qMY9mHWedztV/mOxfGWAkccZAABVKj/y7jnn9OLjG+mRB7pqQ9F2HS8//5yfZZ2Q6+QehYRf4s2Y1cIdHwDggtyuI3Kd/Pqcr/VMu149Uv+fmjaJuvCHWBWV5ekDKD4AwAVVPrJQg3m983BX7PNAmrqj+AAAF1T5cHrd5+csy1n3MB5A8QEALshmC5HkgefwbL5ROb6RAgDgs2z2GHnijs9ur2IesJ5QfACAC3KExstmCz/na06nS+XlFXK5LLlcbpWXV8jpdJ19oS1EoVH/492g1cTOLQCAKpUfeU8njpRKZ8zTTX1utf40fe1p33t0Qjf95pHkMz7BoYYtxspmj/By0qpRfACAKlnuch0ue+Ws4qseh0IjrlZUk1SP56oNhjoBAFWy2SMU1biHar7viV12R4wiG3X2RqxaofgAANUSGnGFIht1U/XLzy67o6Gim2XKZj/3HKEJDHUCAGrEefIblR8ukqviO1Xu3Xn6/p366RDasMhfKKJBok+VnkTxAQBqyeX8XiePvi/nid2yrJOS7LI5ohUedYNCI6/x2VPYKT4AQFBhjg8AEFQoPgBAUKH4AABBheIDAAQVig8AEFQoPgBAUKH4AABBheIDAAQVig8AEFQoPgBAUPk/O/7vMLuetw4AAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import networkx as nx\n",
    "np.random.seed(0)\n",
    "\n",
    "def generate_neighbor_graph(unlabeled_index, labeled_neighbors):\n",
    "    G = nx.Graph()\n",
    "    nodes = [(i, {'label': y_train[i]}) for i in labeled_neighbors]\n",
    "    nodes.append((unlabeled_index, {'label': 'U'}))\n",
    "    G.add_nodes_from(nodes)\n",
    "    G.add_edges_from([(i, unlabeled_index) for i in labeled_neighbors])\n",
    "    label_colors = ['pink', 'khaki', 'cyan']\n",
    "    colors = [label_colors[y_train[i]] for i in labeled_neighbors] + ['k']\n",
    "    labels = {i: G.nodes[i]['label'] for i in G.nodes}\n",
    "    nx.draw(G, node_color=colors, labels=labels, with_labels=True)\n",
    "    plt.show()\n",
    "    return G\n",
    "    \n",
    "G = generate_neighbor_graph(random_index, labeled_neighbors)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "KNN works when there are just three neighbors. What happens if we increase the neighbor count to four?\n",
    "\n",
    "**Listing 20. 9. Increasing the number of nearest neighbors**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAb4AAAEuCAYAAADx63eqAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAsiklEQVR4nO3de3RU9b3+8ffkQgg3MVpRUKRiBUEoHEm4GUFjBAPBJCTA7LGNnlOP2KO1Sn/1VFvqamsLPVW09FiLAp7K7EmAcAsBNFxMgoAEUeGAgKKgpFxEkEDumdm/P7A5XoKGMJM9l+e1lmu5MnsmD0vMk+/e370/DsuyLERERCJElN0BRERE2pKKT0REIoqKT0REIoqKT0REIoqKT0REIoqKT0REIoqKT0REIoqKT0REIoqKT0REIkqM3QHE/3yNp/D5qsBqxBHVnqiYi3E4Yu2OJSISFFR8YcKyGmmo2Udd1VZ8jafA8cXFvEW7+H606/gvRMck2JZRRCQYOPSsztDXWHeQqpMrAAushnMcFQU4iGnfmw5d78Dh0O88IhKZVHwhrr5mLzWfrQEaW/iOGKJjv0PHSyap/EQkImlzSwhrrP/HeZYeQCPehk+oPlmIfucRkUik4gthtZXraa705szbyOixz3BZr0e5/6d5zbyzkcb6j/A2HAl4RhGRYKPiC1Hehk/xNnza7GuXX34RP3voNu6aknTuD7C81FVtC1A6EZHgpYs8Iaqu6k3A2+xrE9IGAPD2jo+pOHzqHJ9g0Vi7H5+vhqio+MCEFBEJQlrxhajGug+BC7xG54jGW/8Pv+QREQkVKr4QZVn1/vgULF+tHz5HRCR0qPhClsNPH6O/AiISWfRTL0Q5otr741Nw6PqeiEQYFV+IahffD4hu9rXGRi+1tQ14vRZer4/a2gYaG5vbCGMR0+7KgOYUEQk2enJLiPJ5qzh97AWa29n5hz+9wsyni7/0tUcfSeUXPxvzha9E0a7DIOIvuiWwQUVEgoyKL4RVnVhGY90HtG53ZzSdv3MPUTEX+TuWiEhQ06nOEBZ/0a04HHGteGcMcZ2SVHoiEpFUfCEsKroLHS+ZBI44WrrLs7qmnlPVVxDXaXhgw4mIBCkVX4iLjv0OnS/9IdGxV3D2QTzn+E/qiAVHHPs/upRbbv8FJ0+ebMuYIiJBQ9f4woi38ST1Vdupr9kNVj3/XAVGxVxG+06JxLS/FocjmmnTpvG///u/rFq1iujo5neGioiEKxVfmLIsH+BrduZeY2MjY8aMISkpiT/84Q9tH05ExEY61RmmHI6ocw6ajYmJIS8vD4/Hw+LFi9s4mYiIvbTii2Dbt29nzJgxvPbaa/Tv39/uOCIibUIrvgj2L//yLzz99NNkZGTw2Wef2R1HRKRNaMUnPPTQQ7z//vsUFhYSFaXfhUQkvOmnnPCnP/2JM2fO8MQTT9gdRUQk4FR8QmxsLAsXLuSll15i2bJldscREQkoneqUJlu3bmXcuHGUlpZy/fXX2x1HRCQgtOKTJklJScycOZPMzEwqKyvtjiMiEhBa8cnX/PjHP+Yf//gHS5Ys0WYXEQk7+qkmX/PMM8/wySef8OSTT9odRUTE77Tik2YdPnyYxMREnn/+ecaPH293HBERv1HxyTlt3ryZO++8k40bN3LdddfZHUdExC90qlPOafjw4fz2t78lMzOT06dP2x1HRMQvtOKTb2RZFv/+7//OiRMnWLx4MQ5HywbeiogEK6345Bs5HA7+8pe/UFFRwYwZM+yOIyJywbTikxapqKggMTGRefPmMXbsWLvjiIi0mopPWqysrIzs7Gw2bdpE79697Y4jItIqOtUpLZacnMz06dPJzMykqqrK7jgiIq2iFZ+cF8uy+Nd//VdqamrweDza7CIiIUcrPjkvDoeDv/71r+zfv5+nnnrK7jgiIudNKz5plY8//pikpCRefvllbrvtNrvjiIi0mFZ80ipXXXUVHo+H
u+66iw8//NDuOCIiLabik1YbPXo0//mf/0lWVhbV1dV2xxERaRGd6pQLYlkWP/jBDwB4+eWXtdlFRIKeVnxyQRwOB3PmzGHXrl08++yzdscREflWWvGJXxw4cIBhw4bh8Xi45ZZb7I4jInJOWvGJX/Tq1YsFCxZgGAYfffSR3XFERM5JxSd+c9tttzFt2jSysrKoqamxO46ISLN0qlP8yrIsnE4n7du3Z/78+drsIiJBRys+8SuHw8HcuXPZvn07zz33nN1xRES+Ris+CYj9+/czYsQIFi9eTHJyst1xRESaaMUnAdG7d2/+53/+h8mTJ3Po0CG744iINFHxScCMHTuWBx98kIkTJ1JXV2d3HBERQKc6JcAsyyInJ4euXbvywgsvaLOLiNhOKz4JKIfDwfz589m8eTNz5syxO46IiFZ80jbee+89Ro4cyfLlyxk+fLjdcUQkgmnFJ23ie9/7HvPnzycnJ4fDhw/bHUdEIpiKT9rMuHHjuO+++8jOzqa+vt7uOCISoXSqU9qUz+cjKyuL7t276wZ3EbGFVnzSpqKiovj73//O+vXrmTt3rt1xRCQCacUnttizZw/JyckUFRWRlJRkdxwRiSBa8Ykt+vbtywsvvMDEiRM5evSo3XFEJIKo+MQ2GRkZ3HPPPeTk5NDQ0GB3HBFpQxawHhgP9AK+A1wF3AIsBxoD+L11qlNs5fP5SE9Pp3fv3vz5z3+2O46IBJgFzAOeAD4DzjRzTGegHfAz4Of4f4Wm4hPbffbZZyQmJvKrX/2KH/7wh3bHEZEA8QH3AwuA6hYc3wEYBSwF4vyYQ8UnQWHXrl2MHj2aNWvWcOONN9odR0QC4GFgDi0rvX+KB24HluC/lZ+u8UlQ6N+/P88//zwTJ07kk08+sTuOiPjZBs5RenV18G//BldfDZ07w+DBsHp108s1wFrg737MouKToDFx4kQMw2Dy5Mk0Ngby0raItLU/co6VXmMjXHUVlJTAqVPw29/CpElw4EDTIVXADD9m0alOCSper5dx48bRv39/nnrqKbvjiIgfVADXArUtfcPAgfDrX8PEiU1f6sjZVWOiH/JoxSdBJTo6GtM0WbZsGaZp2h1HRPwgn7O7OVvk6FHYtw/69//Sl2uAF/2UJ8ZPnyPiNwkJCSxdupSUlBT69evHoEGD7I4kIhfgAFDXkgMbGsDlgtxc6Nv3Sy/5gA/9lEcrPglKAwcOZPbs2WRmZvLpp5/aHUdELkBVSw7y+eAHP4B27eAvf2n2kPPZDfpNVHwStKZMmUJ2djZOpxOv12t3HBFppcu+7QDLOruz8+hRKCiA2NhmD7vET3lUfBLU/vCHP+Dz+Xj88cftjiIirdSvspK4b5rBef/98O67UFgI8fHNHtIBGO2nPNrVKUHv+PHjJCYmMnPmTCZNmmR3HBFpgerqalasWIHb7abk9depO3iQ+s6dv37gwYPQqxfExUHMF7ad/O1vZ6/3fa49cBjo6odsKj4JCW+99Ra3334769evZ8CAAXbHEZFmNDY2snbtWtxuN4WFhQwbNgyXy0VGRgZPd+7MDM7jloYviAamcPZRZ/6g4pOQsWDBAp544gnKy8u5+OKL7Y4jIoBlWbzxxhu43W4WLlzId7/7XVwuF5MmTaJbt25Nx30C9AVOtOJ7dAS2ff5+f1DxSUj56U9/yr59+ygsLCQ6OtruOCIRa8+ePbjdbkzTJDY2FpfLhdPp5Nprrz3ne94Gkml+IsO5xHP2IdVjLijtl6n4JKQ0NDSQmprKTTfdxO9+9zu744hElIqKCvLy8jBNk8OHD+N0OnG5XAwePBiHw9Giz9gJpHD2hvRvKsAOnD3FuQL/bWr5JxWfhJxjx46RmJjIrFmzyMrKsjuOSFj77LPPKCgowO128/bbb5OZmYnL5WLUqFGtPutSDXg4+/zNw4CDs4Nn/7m1pQswDbgHCMRFDRWfhKTy8nLS0tIoKSmhX79+dscRCSu1tbUUFRXhdrtZt24d
t912Gy6Xi7S0NNq3b++372MBbwLvApVAJ+AaYCSBvddOxSch66WXXuL3v/895eXlXHTRRXbHEQlpXq+X1157DbfbzbJlyxg8eDAul4usrCy6du1qdzy/UvFJSHvggQf46KOPWLZsGVFReh6DyPmwLIvt27fjdrvJy8vjiiuuwOVyMXnyZHr06GF3vIBR8UlIq6+vJyUlhdtuu41f//rXdscRCQnvv/8+pmlimiYNDQ24XC4Mw6BvX3/dMBDcVHwS8o4cOUJiYiL//d//zYQJE+yOIxKUjh49Sn5+Pm63mwMHDjB58mRcLhdJSUkt3pEZLlR8Eha2bNnChAkTKCsro0+fPnbHEQkKp0+fZunSpbjdbt544w0mTJiAy+UiJSWFmJjInUqn4pOw8cILL/D000/zxhtv0KVLF7vjiNiivr6eNWvW4Ha7WbNmDaNGjcLlcpGenk6HDh3sjhcUVHwSVu677z6OHTtGQUGBNrtIxPD5fGzcuBG3201BQQH9+vXD5XKRnZ3NJZf4a5hP+FDxSVipq6tj9OjRjB8/XqOMJOzt2LEDt9uNx+Oha9euuFwupkyZwtVXX213tKAWuSd5JSzFxcVRUFBAYmIigwcPJi0tze5IIn518ODBph2ZlZWVGIZBUVGRppacB634JCy9/vrrZGZmsmnTpm98aK5IKDh+/DiLFi3C7XazZ88ecnJyMAyDkSNH6pR+K6j4JGz99a9/5bnnnmPz5s106tTJ7jgi56WqqqppkGtZWRlpaWkYhsGYMWNo166d3fFCmopPwpZlWfzoRz/i9OnT5OfnR9y9ShJ6Ghoamga5rly5kuHDh2MYBhkZGXRubnq5tIqKT8JabW0tN998M9nZ2fz85z+3O47I11iWxZYtW5oGufbu3RvDML42yFX8R5tbJKy1b9+egoIChg4dyqBBg7j99tvtjiQCwLvvvts0yDUuLg6Xy8XmzZvp3bu33dHCnlZ8EhFKSkqYNGkSmzdv5pprrrE7jkSoiooKPB4PbrebY8eO4XQ6MQzjvAa5yoVT8UnE+POf/8zcuXPZtGkTHTt2tDuORIiTJ082DXJ95513yMrKwjCMCxrkKhdGxScRw7IscnNzaWxsxO126zdsCZja2lpWrlyJ2+1m/fr1pKamYhiG3we5Suuo+CSi1NTUMHLkSO666y4eeeQRu+NIGPF6vWzYsKFpkOuNN96IYRhhOcg11Kn4JOIcPHiQoUOHYpomt956q91xJIRZlsWbb76J2+0mPz+f7t27Nw1y7d69u93x5BxUfBKR1q9fj8vlYsuWLXquoZy39957r+mxYV6vF5fLhdPpjJhBrqFOxScR6+mnn8btdrNx40bi4+PtjiNB7siRI02DXA8ePBjRg1xDnYpPIpZlWbhcLmJjY3nppZf0w0u+prKysmmQ69atWzXINUyo+CSiVVdXM3z4cH70ox/x4IMP2h1HgkB9fT2rV6/G7XbzyiuvaJBrGFLxScT74IMPGD58OAsXLmTUqFF2xxEb+Hw+ysrKmga59u/fX4Ncw5iKTwR49dVXyc3NZevWrVx11VV2x5E2YFnWlwa5JiQkYBgGTqeTnj172h1PAkjFJ/K5mTNnUlBQQGlpqW4yDmMHDhzANE3cbjdnzpzBMAxcLhc33HCD3dGkjaj4RD5nWRaTJk2iS5cuvPjii9rsEkaOHz/OwoULcbvd7N27l5ycHFwuFyNGjNAg1wik4hP5gjNnzjBs2DAeeOABpk6danccuQBVVVUsX7686ZaVtLQ0XC4Xt99+uwa5RjgVn8hXvP/++4wcOZIlS5YwcuRIu+PIeWhoaKC4uBi3201RURHDhw/H5XKRkZFBp06d7I4nQULFJ9KMVatWce+991JeXq5HTwU5y7LYvHkzbrebRYsWce211zYNcr3sssvsjidBSMUncg5PPvkkRUVFbNiwgbi4OLvjyFfs3r27aZBr+/btcblcGIaheYvyrVR8Iufg8/mYOHEi3bp14/nn
n7c7jgCHDh3C4/FgmmbTIFeXy8WgQYO0GUlaTMUn8g0qKysZOnQojzzyCPfee6/dcSLSyZMnWbx4MaZpNg1ydblc3HzzzRrkKq2i4hP5Fnv37uWmm26isLCQYcOG2R0nItTU1LBy5UpM02wa5Opyubjjjjt0j6VcMBWfSAusWLGC//iP/6C8vJzLL7/c7jhhyev1sn79ekzTbBrk6nK5yMrK4qKLLrI7noQRFZ9ICz3xxBOsW7eOdevW6T4wP7Esi23btjUNcu3Ro4cGuUrAqfhEWsjn83HnnXfSq1cvZs+ebXeckPbee+817cj0+XxNOzL79OljdzSJACo+kfNw6tQpkpKS+MUvfsHdd99td5yQcuTIEfLy8jBNk48++qhpkGtiYqJ2ZEqbUvGJnKfdu3czatQoVq9ezZAhQ+yOE9QqKytZsmQJpmlSXl7eNMj11ltv1SBXsY2KT6QVlixZwsMPP0x5ebmeDvIVdXV1rF69GtM0eeWVVxg9ejQul4vx48drkKsEBRWfSCs9/vjjvP766xQXFxMbG2t3HFv5fD5KS0sxTZOCggJuuOGGpkGuCQkJdscT+RIVn0greb1exo8fT9++fZk1a5bdcdqcZVm88847TYNcL7nkElwuF06nU8N8JajpJLtIK0VHR2OaJomJidx4443cdddddkdqEx9++CGmaWKaJlVVVRiGwZo1azTIVUKGVnwiF2jnzp3ceuutvPrqqwwePNjuOAHxySefsHDhQkzTZN++fU2DXIcPH65BrhJyVHwifrBw4UIeffRRysvLufTSS+2O4xdnzpxh+fLlmKbJ66+//qVBrpF+TVNCm4pPxE8effRR3nzzTdasWROyW/UbGhp49dVXMU2ToqIiRowYgcvl4s4779QgVwkbKj4RP/F6vdxxxx0MGjSIP/7xj3bHaTHLsti0aVPTINfvfe97uFwucnJydKuGhKXQ/LVUJAhFR0fj8XiaNrtMnjzZ7kjfaNeuXU07MuPj43G5XLzxxhsa5CphTys+ET97++23SU1NZd26dQwcONDuOF/y8ccfNw1yPX78OE6nE8MwNMhVIoqKTyQATNPkV7/6FeXl5bbfwH3ixImmQa47d+4kKysLwzA0yFUilopPJEAeeeQRdu/eTVFRUVPB+Hw+ysrK+OCDDzhz5gxdunShX79+DBkyxK8rrpqaGgoLCzFNkw0bNnD77bdjGAZpaWnExcX57fuIhCIVn0iANDY2kpqayvDhw/nZz37GvHnzeOqppzhz5gyWZeH1eomJicGyLK644goeffRRnE4nHTt2bPX3W79+PW63mxUrVjBkyBBcLheZmZka5CryBSo+kQD65JNPuOGGG6isrCQqKorq6upzHtuxY0fi4+NZv349AwYMaNHnW5ZFeXl50yDXq666qmmQ6xVXXOGvP4ZIWNGuTpEA2rZtG5WVldTW1n7rsVVVVVRVVTFixAjKysoYNGjQOY/dt29f0yBXAJfLRUlJiQa5irSAVnwiAfLuu+8yZMiQb1zlnUtCQgJ79uzhO9/5TtPXDh8+3DTI9eOPP2bKlCkYhqFBriLnSQ/ZEwmQ3/72ty1a6TWnurqa559/nlOnTjF//nxSU1Pp168f77zzDk8++SSHDh3imWeeISkpSaUncp604hMJgBMnTtCjR49WFx9Au3btiIuL49Zbb8UwDNLT04mPj/djSpHIpGt8IgEwf/78C55a4HA4eO655yJm3JFIW9GpTpEA2LBhQ6uu7X1RXV0du3fv9lMiEfknFZ9IAJw8edIvn3Ps2DG/fI6I/B8Vn0gA+OtanEYBififik8kAHr16nXB1/ji4uK4+uqr/ZRIRP5Jm1tE/Ki6uprCwkL27duHz+e74M8L9tFGIqFIKz6RC9TQ0MCqVau466676N69O/PmzeOee+6hd+/erf5Mh8NBSkoK3bt392NSEQGt+ERaxefzsXHjRjweD4sXL+a6667D6XTy1FNP0a1bt6bjHnzwQaqq
qs778+Pj4/n5z3/uz8gi8jndwC7SQpZl8dZbb+HxeMjLy+Piiy/GMAymTJlCr169vna8z+cjMzOT4uJiampqWvx9OnTowP3338+f/vQnP6YXkX9S8Yl8i3379uHxePB4PNTX1+N0OnE6ndxwww3f+t7a2lqysrIoKSlp0X19UVFRDBs2jLKysgveHCMizdP/WSLNqKio4Omnn2bIkCGMGjWKEydO8NJLL7F//36efPLJFpUeQPv27Vm5ciXTp0/n0ksvpXPnzl87xuFw0KlTJ3r27MmMGTPYv38/ZWVl/v4jicjntOIT+dyJEydYvHgxHo+Hd955h8zMTJxOJ6NHjyYm5sIvh3u9XlatWsWsWbPYv38/NTU1dOzYkQEDBjBt2jRuvvlmHA4HxcXF5ObmsnXrVq688ko//MlE5ItUfBLRzpw5w4oVK/B4PJSWljJmzBgMw2Ds2LG0b9/etlwzZsxg2bJllJSUEBcXZ1sOkXCk4pOIU19fzyuvvIJpmqxevZoRI0bgdDrJyMho9lSkHSzLIicnh4SEBObMmWN3HJGwouKTiOD1eiktLcXj8bBkyRKuv/56DMMgOzv7S8Neg8np06cZOnQoDz/8MPfee6/dcUTChu7jk7BlWRZvvvkmpmmSn5/PZZddhtPpZPv27fTs2dPueN+qc+fOLF26lOTkZAYOHMjQoUPtjiQSFrTik7CzZ88ePB4PpmkCNN1+cP3119ucrHVWrFjBAw88QHl5+ZdujheR1lHxSVj4+OOPycvLw+PxcOTIEaZMmYLT6WTIkCE4HA67412w6dOnU1paSnFxMbGxsXbHEQlpKj4JWcePH2fx4sWYpsmuXbvIysrCMAxuvvlmoqOj7Y7nV16vl/T0dPr06cOsWbPsjiMS0lR8ElJOnz7N8uXL8Xg8bNy4kbS0NJxOJ2PGjAn7bf8nT54kMTGR3/zmNxiGYXcckZCl4pOgV1dXx5o1azBNkzVr1pCcnIxhGEyYMCHiBrXu2LGDlJQU1q5dy/e//32744iEJBWfBCWv18trr72GaZosW7aMAQMG4HQ6yc7O5pJLLrE7nq3y8vJ4/PHHKS8vJyEhwe44IiFHxSdBw7Istm7disfjIT8/nx49euB0Opk8ebIe3fUV06ZNY9euXRQVFYXd9UyRQFPxie12796NaZp4PB5iYmIwDAOn08l1111nd7Sg1djYSGpqKiNHjuR3v/ud3XFEQoqKT2xx8OBB8vLyME2TTz/9lClTpmAYBoMHDw6L2w/awrFjx0hMTOTZZ58lIyPD7jgiIUPFJ23m2LFjLFq0CI/Hw549e8jOzsbpdJKcnKzZc61UXl7OuHHjKC0tpW/fvnbHEQkJKj4JqMrKSpYtW4ZpmmzZsoVx48ZhGAapqam0a9fO7nhhYd68efzXf/0Xb7zxBl26dLE7jkjQU/GJ39XW1rJq1SpM06S4uJjRo0fjdDpJT0+nY8eOdscLS1OnTuXYsWMsXrxYq2eRb6HiE79obGxk/fr1eDweli9fzuDBg3E6nWRlZWnLfRuoq6tj9OjRpKen89hjj9kdRySoqfik1SzLYsuWLZimyaJFi+jZsyeGYTBp0iS6d+9ud7yIU1FRQVJSEvPmzWPMmDF2xxEJWio+OW87d+7E4/Hg8Xho37590+0H1157rd3RIl5ZWRnZ2dls3ryZa665xu44IkFJxSct8uGHHzaV3alTp5pG/Xz/+9/X7QdBZvbs2cydO5dNmzbRoUMHu+OIBB0Vn5zT0aNHWbhwIaZpsn//frKzszEMgxEjRmgDRRCzLIvc3Fy8Xi8LFizQLyYiX6Hiky/57LPPWLp0KaZpUl5ezoQJEzAMg5SUFM2BCyHV1dWMHDmSu+++m4ceesjuOCJBRcUn1NTUsHLlSjweD+vWrSMlJQWn08m4ceN0qiyEHThwgGHDhpGfn8+oUaPsjiMSNFR8EaqhoYF169ZhmiaFhYUMGTIEwzDIzMyka9eudscTPykuLiY3
N5etW7fqQd8in1PxRRCfz8emTZvweDwsWrSI3r1743Q6mTRpEpdffrnd8SRAZsyYwbJlyygpKQn7Yb0iLaHiC3OWZbFjxw5M0yQvL4/OnTtjGAZTpkzRdvcIYVkWOTk5JCQkMGfOHLvjiNhOxRem9u/fj8fjwTRNqqurcTqdGIbBgAED7I4mNjh9+jRDhw7l4Ycf5t5777U7joitVHxh5PDhw+Tn5+PxeDhw4ACTJk3CMAyGDRumLe3C3r17SU5OprCwkKFDh9odR8Q2Kr4Qd/LkSQoKCvB4PGzfvp2MjAycTie33norMTExdseTILNixQoeeOABysvL6datm91xRGyh4gtB1dXVFBYWYpomr732GqmpqRiGQVpaGu3bt7c7ngS56dOnU1JSwtq1a3VvpkQkFV+IaGho4NVXX8Xj8bBy5UqGDRuG0+kkMzNTM9jkvHi9XtLT0+nTpw+zZs2yO45Im1PxBTGfz8fGjRsxTZOCggL69OmD0+kkJyeHyy67zO54EsJOnjxJYmIiv/nNbzAMw+44Im1KF4GCjGVZvPXWW3g8HvLy8khISMAwDLZt28bVV19tdzwJExdffDFLliwhJSWFfv36MWjQILsjibQZrfiCxL59+5puP2hsbGyaftC/f3+7o0kYy8vL47HHHmPbtm0aGCwRQ8Vno0OHDjXdflBRUcHkyZNxOp0kJSXp9gNpM9OmTWPXrl0UFRURHR1tdxyRgFPxtbFPP/2UgoICTNNk586dZGRkYBgGo0eP1g8dsUVjYyOpqamMGDGCJ5980u44IgGn4msDZ86cYcWKFXg8HkpLSxk7diyGYTB27Fg9O1GCwrFjx0hMTOSZZ54hMzPT7jgiAaXiC5D6+npeeeUVTNNk9erVjBw5EqfTyZ133knnzp3tjifyNeXl5YwbN47S0lL69u1rdxyRgFHx+ZHX66W0tBSPx8OSJUvo168fhmGQnZ3NpZdeanc8kW81b948/vjHP7J161bdHyphS8V3gSzL4s0338Q0TfLz8+nWrRtOp5PJkyfTs2dPu+OJnLepU6dy9OhRCgoKiIqKsjuOiN+p+Frp3XffxePx4PF4ADAMA6fTqVNEEvLq6uoYPXo06enpPPbYY3bHEfE7Fd95+Pjjj8nLy8M0TY4dO8bkyZMxDIMbb7xRtx9IWKmoqCApKYm5c+cyduxYu+OI+JWK71scP36cRYsW4fF42L17N1lZWRiGQXJysm4/kLBWVlZGdnY2mzdv1tBiCSsqvmacPn2a5cuXY5ommzZtIi0tDafTyZgxY2jXrp3d8UTazOzZs5k7dy6bNm2iQ4cOdscR8QsV3+fq6upYvXo1Ho+HNWvWcPPNN2MYBhMmTKBjx452xxOxhWVZ5Obm4vV6WbBggU7pS1iI6OLzer1s2LABj8fDsmXLGDhwIE6nk4kTJ3LJJZfYHU8kKFRXVzNy5EjuvvtuHnroIbvjiFywiCs+y7LYunUrHo+H/Px8evTogWEYTJ48mR49etgdTyQoHThwgGHDhpGfn8+oUaPsjiNyQSKm+Hbt2tV0+0FsbGzT9IPrrrvO7mgiIaG4uJjc3Fy2bt3KlVdeaXcckVYL6+I7cOAAeXl5eDweTpw4wZQpUzAMg0GDBulahUgrzJgxg6VLl1JaWqrnzErICrviO3bsGIsWLcI0Tfbt20d2djZOp5ObbrpJT6EQuUCWZZGTk0NCQgJz5syxO45IqwS0+M4AC4DVwKdALHAlkAukAP5ac1VWVrJ06VI8Hg9btmxh/PjxGIZBamoqsbGxfvouIgJnb/cZOnQoDz/8MPfee6/dcUTOW0CK7yDwJGdLLwqo+srrnYAuwP8Dfgy05s642tpaioqK8Hg8FBcXc8stt+B0OklPT9f9RiIBtnfvXpKTkyksLGTo0KF2xxE5L34vvjeAMZwtu8ZvObYDMABYA3RtwWc3Njayfv16TNNkxYoVDB48GMMwyMrK4uKLL76g3CJyflasWMEDDzxA
eXk53bp1szuOSIv5tfjeAW7i7CnOlooDrgc2A+2bed2yLDZv3ozH42HhwoX06tULp9PJpEmT6N69ux9Si0hrTZ8+nZKSEtauXavLChIy/FZ8DZy9fnfsqy/U1cGPfwxr18KJE3DttfD738MddzQdEg/8EHj+C2/buXMnpmmSl5dHfHw8hmEwZcoUrr32Wn/EFRE/8Hq9pKen06dPH2bNmmV3HJEWifHXBy0Fqpt7obERrroKSkqgZ09YtQomTYKdO6FXLwBqgL8D93/4IUUeD6Zpcvr0aZxOZ9MTVXT7gUjwiY6Oxu12k5iYSGJiIoZh2B1J5Fv5bcV3I7C9pQcPHAi//jVMnNj0pajqauJ/8xtyT5/GMAyGDx+u2w9EQsSOHTtISUmhuLiYQYMG2R1H5Bv5pfj2c3aTSk1LDj56FK6+Gt5+G74ytLW3ZfG+VnYiISkvL4/HHnuMbdu2kZCQYHcckXPyy5LqQ1p4S0JDA7hckJv7tdID+IdKTyRkTZkyhczMTAzDwOv12h1H5Jz8UnxngG9dNvp88IMfQLt28Je/NHtInT/CiIhtZs6cSV1dHdOnT7c7isg5+aX4uvAtT2GxLPi3fzt7mrOgAM6x7bm52xlEJHTExMSQn5/PggULWLp0qd1xRJrll+LrA9R+0wH33w/vvguFhRAff87DdKOCSOi77LLLWLx4Mffddx979uyxO47I1/il+HoAI8/14sGD8Le/nd3Mcvnl0KnT2X/c7i8d1omzjzATkdCXmJjIjBkzyMjIoLKy0u44Il/it9sZXgGyOb+ntnxRJ+ATdLpTJJxMnTqVo0ePUlBQoNuTJGj47W9iKtAdiG7FezsA01DpiYSbZ599liNHjjBjxgy7o4g08euzOg8Bg4CTgK+F74nn7Iii5fixhUUkaFRUVJCUlMTcuXMZO3as3XFE/Ns1VwLlnL3m15LBQB2BO4El/g4iIkGjR48e5OXlkZubywcffGB3HBH/9813gV3A7zlbgJ2+8nosZ09p3gzkA+bnXxOR8JWcnMwvf/lLsrKyqK5u9qm+Im0moBPYLWAD8BpwlLOF14Ozm2CuCdQ3FZGgZFkWubm5eL1eFixYoAfPi20CWnwiIl9UXV3NyJEjufvuu3nooYfsjiMRSsUnIm3qwIEDDBs2jPz8fEaNGmV3HIlA2lMiIm2qV69evPzyyzidTg4dOmR3HIlAKj4RaXOpqan85Cc/YeLEidTV6fH00rZ0qlNEbGFZFjk5OSQkJDBnzhy740gE0YpPRGzhcDiYP38+Gzdu5IUXXrA7jkQQrfhExFZ79+4lOTmZwsJChg4dancciQBa8YmIrfr06cOLL75ITk4OR48etTuORACt+EQkKEyfPp2SkhLWrl1L7DmGVYv4g4pPRIKC1+slPT2dPn36MGvWLLvjSBjTqU4RCQrR0dG43W4KCwsxTdPuOBLGtOITkaCyY8cOUlJSKC4uZtCgQXbHkTCkFZ+IBJWBAwcye/ZssrKyOHHihN1xJAxpxSciQWnatGns2rWLoqIioqOj7Y4jYUQrPhEJSjNnzqSuro7p06fbHUXCjIpPRIJSTEwM+fn5LFiwgKVLl9odR8KITnWKSFArLy8nLS2NsrIy+vbta3ccCQNa8YlIUEtMTGTmzJlkZGRQWVlpdxwJA1rxiUhImDp1KkePHqWgoICoKP3OLq2nvz0iEhKeffZZjhw5wowZM+yOIiFOKz4RCRkVFRUkJSUxd+5cxo4da3ccCVEqPhEJKWVlZWRnZ7N582auueYau+NICNKpThEJKcnJyfzyl78kMzOT6upqu+NICNKKT0RCjmVZ5Obm4vV6WbBgAQ6Hw+5IEkK04hORkONwOHj++efZvXs3zz77rN1xJMRoxSciIevAgQMMGzaM/Px8Ro0aZXccCRFa8YlIyOrVqxcvv/wyTqeTQ4cO2R1HQoSKT0RCWmpqKj/5yU+YOHEidXV1dseREKBTnSIS8izLIicnh4SEBObM
mWN3HAlyWvGJSMhzOBzMnz+fjRs38sILL9gdR4KcVnwiEjb27t1LcnIyhYWFDB061O44EqS04hORsNGnTx9efPFFsrOzOXr0qN1xJEhpxSciYWf69OmUlJSwdu1aYmNj7Y4jQUbFJyJhx+v1kp6eznXXXcczzzxjdxwJMjrVKSJhJzo6GrfbzcqVK3G73XbHkSCjFZ+IhK0dO3aQkpJCcXExgwYNsjuOBAmt+EQkbA0cOJDZs2eTlZXFiRMn7I4jQUIrPhEJe9OmTWPXrl0UFRURHR1tdxyxmVZ8IhL2Zs6cSV1dHdOnT7c7igQBFZ+IhL2YmBjy8/NZsGABS5cutTuO2EynOkUkYpSXl5OWlkZZWRl9+/a1O47YRCs+EYkYiYmJzJw5k4yMDCorK+2OIzbRik9EIs7UqVM5evQoBQUFREXp9/9Io+ITkYhTV1fH6NGjSU9P57HHHmv6uuWro7HuID5fNeDD4WhPTLsriYrpYl9Y8TsVn4hEpIqKCpKSkpg7dy6pKUOoq9pGQ81ecESB5QOspn+PbteduE5JxLS7GofDYXd0uUAqPhGJWGVlpWwpnc2994zE4bAA37kPdsQSHfMdOiZk4YiKa7OM4n8qPhGJSJZlUXOqmOrTO4mNaem7oomKvohOl7pwRLULZDwJIF3VFZGIVF+1nYaad8+j9AC8+LynqD65PFCxpA2o+EQk4liWl9ozm4DGr702Z95GRo99hst6Pcr9P81r5t1eGuv/gbfhWMBzSmCo+EQk4jTUvg80f5Xn8ssv4mcP3cZdU5K+4RO81J15MyDZJPBUfCIScerOlIPV0OxrE9IGMP6OG0i4uMM3fIJFQ+1eLF9dYAJKQKn4RCTi+Bo/ufAPcUTjbdSoo1Ck4hORiGJZPr7xtoXz+iyt+EKRik9EIozj838u/HMcjvPaEipBQsUnIhHF4XDgcLS/8A+yvDiiOl3450ibU/GJSMSJ7TAAaH4Se2Ojl9raBrxeC6/XR21tA42N3q8dFxXTleiYroENKgGhJ7eISMTxeU9z+thc4OuF9oc/vcLMp4u/9LVHH0nlFz8b839fcMQSf1Eq7eKvD3BSCQQVn4hEpKpPC2is/4hWbXRxxNGl21Rd4wtROtUpIhEpvusdOKLiOf+NLjF0TMhQ6YUwFZ+IRKSo6A50umQKjqgOtPxHYQwdLk4npt2VgYwmAaZTnSIS0Xy+amorS8/O4gO+/vzOs5tgott1J77LKKJju7VpPvE/FZ+ICGenr9fX7Ka++n+xfDWcncAeR0z77xLXYTBRMRfZHVH8RMUnIiIRRdf4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkoqj4REQkovx/3n7iPMStS/0AAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "np.random.seed(0)\n",
    "labeled_neighbors = np.argsort(labeled_distances)[:4]\n",
    "G = generate_neighbor_graph(random_index, labeled_neighbors)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
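    "There is a tie: two neighbors vote for class 1 and two vote for class 2. We can break that tie by weighting each neighbor's vote by the inverse of its distance to the unlabeled point, so that closer neighbors count for more.\n",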
    "\n",
    "**Listing 20. 10. Weighing votes of neighbors based on distance**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A data point with a label of 2 is 0.54 units away. It receives 1.86 votes.\n",
      "A data point with a label of 1 is 0.74 units away. It receives 1.35 votes.\n",
      "A data point with a label of 2 is 0.77 units away. It receives 1.29 votes.\n",
      "A data point with a label of 1 is 0.98 units away. It receives 1.02 votes.\n",
      "\n",
      "We counted 3.15 votes for class 2.\n",
      "We counted 2.36 votes for class 1.\n",
      "Class 2 has received the plurality of the votes.\n"
     ]
    }
   ],
   "source": [
    "from collections import defaultdict\n",
    "class_to_votes = defaultdict(int)\n",
    "for node in G.neighbors(random_index):\n",
    "    label = G.nodes[node]['label']\n",
    "    distance = distance_matrix[random_index][node]\n",
    "    num_votes = 1 / distance\n",
    "    print(f\"A data point with a label of {label} is {distance:.2f} units \"\n",
    "          f\"away. It receives {num_votes:.2f} votes.\")\n",
    "    class_to_votes[label] += num_votes\n",
    "  \n",
    "print()\n",
    "for class_label, votes in class_to_votes.items():\n",
    "    print(f\"We counted {votes:.2f} votes for class {class_label}.\")\n",
    "    \n",
    "top_class = max(class_to_votes.items(), key=lambda x: x[1])[0]\n",
    "print(f\"Class {top_class} has received the plurality of the votes.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 20.2 Measuring Predicted Label Accuracy\n",
    "\n",
    "We want to analyze predictions across all the points within `X_test`. We'll define a `predict` function for this purpose.\n",
    "\n",
    "**Listing 20. 11. Parameterizing KNN predictions**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict(index, K=1, weighted_voting=False):\n",
    "    labeled_distances = distance_matrix[index]\n",
    "    labeled_neighbors = np.argsort(labeled_distances)[:K]\n",
    "    class_to_votes = defaultdict(int)\n",
    "    for neighbor in labeled_neighbors:\n",
    "        label = y_train[neighbor]\n",
    "        distance = distance_matrix[index][neighbor]\n",
    "        num_votes = 1 / max(distance, 1e-10) if weighted_voting else 1\n",
    "        class_to_votes[label] += num_votes\n",
    "    return max(class_to_votes, key=lambda x: class_to_votes[x])\n",
    "\n",
    "assert predict(random_index, K=3) == 2\n",
    "assert predict(random_index, K=4, weighted_voting=True) == 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's execute `predict` across all unlabeled indices.\n",
    "\n",
    "**Listing 20. 12. Predicting all unlabeled flower classes**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred = np.array([predict(i) for i in range(y_test.size)])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to compare the predicted classes with the actual classes in `y_test`. Let's start by printing both the `y_pred` and `y_test` arrays.\n",
    "\n",
    "**Listing 20. 13. Comparing the predicted and actual classes**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Predicted Classes:\n",
      "[2 1 0 2 0 2 0 1 1 1 2 1 1 1 2 0 2 1 0 0 2 1 0 0 2 0 0 1 1 0 2 1 0 2 2 1 0\n",
      " 2 1 1 2 0 2 0 0 1 2 2 1 2 1 2 1 1 1 1 1 1 1 2 1 0 2 1 1 1 2 2 0 0 2 1 0 0\n",
      " 1 0 2 1 0 1 2 1 0 2 2 2 2 0 0 2 2 0 2 0 2 2 0 0 2 0 0 0 1 2 2 0 0 0 1 1 0\n",
      " 0 1]\n",
      "\n",
      "Actual Classes:\n",
      "[2 1 0 2 0 2 0 1 1 1 2 1 1 1 1 0 1 1 0 0 2 1 0 0 2 0 0 1 1 0 2 1 0 2 2 1 0\n",
      " 1 1 1 2 0 2 0 0 1 2 2 2 2 1 2 1 1 2 2 2 2 1 2 1 0 2 1 1 1 1 2 0 0 2 1 0 0\n",
      " 1 0 2 1 0 1 2 1 0 2 2 2 2 0 0 2 2 0 2 0 2 2 0 0 2 0 0 0 1 2 2 0 0 0 1 1 0\n",
      " 0 1]\n"
     ]
    }
   ],
   "source": [
    "print(f\"Predicted Classes:\\n{y_pred}\")\n",
    "print(f\"\\nActual Classes:\\n{y_test}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's easier to compare the two arrays if we aggregate them into a **confusion matrix**. The matrix rows will track the predicted classes, while the columns will track the true class identities.\n",
    "\n",
    "**Listing 20. 14. Computing the confusion matrix**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAD4CAYAAADvsV2wAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAdUUlEQVR4nO3deZhU9Z3v8fenugE1rqAsNyKSiEkcUcyjSKIhiEvADb3I3Bij3rmSVqMmmji5jpnHLdExN1EnUa/aEKOM0WhcRoXghho0KohLcMuAC0EFmghGQbki3d/7Rx1I22m6qps6XafrfF4+v6frLPWr7zkPfutX37MpIjAzs9pXqHYAZmbWPZzwzcxywgnfzCwnnPDNzHLCCd/MLCfqqx1Ad9l8p2N9OlLK1iy+sNohmFXIrtrUHjqTc9YsvmWTP68cHuGbmeVEbkb4ZmbdScreeNoJ38wsBQVlL71mLyIzsxrgEb6ZWU5I3XIctlOc8M3MUuERvplZLrikY2aWE074ZmY54bN0zMxywiN8M7OccMI3M8sJ4dMyzcxyIYsj/OxFZGZWAwqF+rJbRyRtJmmupD9KeknShcn8CyS9Len5pB1aKiaP8M3MUlGx8fRHwNiIWC2pF/C4pJnJsisi4mflduSEb2aWgkqVdCIigNXJZK+kden5Hi7pmJmlQCqU3Ur3pTpJzwPLgQcjYk6y6HRJ8yVdL2m7Uv044ZuZpUAUym9Sg6R5rVpD674iojkiRgA7AiMl7Q5cA3wWGAEsBS4rFZNLOmZmKehMSSciGoHGMtb7q6RHgXGta/eSpgDTS73fI3wzsxQUCnVlt45I2kHStsnrzYGDgD9JGtRqtaOBF0vF5BG+mVkKVLnx9CDgRkl1FAfpt0XEdEn/IWkExQO4i4CTS3XkhG9mloIKnqUzH9irnfnHd7YvJ3wzsxRk8UpbJ3wzsxRUsKRTMU74ZmYpUIlbJlRD9iIyM6sBfoi5mVlOuKRjZpYTPmhrZpYXLumYmeVE9gb4TvhmZqkoZC/jO+FnQJ8+vXjot+fRu3cv6uvruOt3c/jx5bezx25DuPKSk+jTpxfrmls484fXM++Pr1U73Joxe/YzXHzxFFpaWpg06WAaGiZVO6Sak+t9nL18XzokSc3J47NelPRbSVt09cMk3SDpmOT1VEm7dbDuGElf3sgySfqFpFeTe0F/sasxZcFHH33MuK//mH3HncO+487hkK/uyci9duHic7/Bxf9+B6PG/ws/uuy3XHzuN6odas1obm7moouuZerUC5gx42qmT5/Nq68urnZYNSXv+zikslt3Kec7aE1EjIiI3YG1wCmtFyY39Om0iJgcES93sMoYoN2ED4wHhiWtgeJ9oXu0Dz78CIBe9XXU19cREUQEW2+1OQDbbLUFS5verWaINWX+/IUMGTKIwYMH0rt3Lw47bDSzZs0p/UYrW+73sTrRuklnf3Q8BuySjL4fkXQz8ELyNJafSno6GXGfDBtG4ldJelnSDKD/+o4kPSpp7+T1OEnPJg/pnSVpZ4pfLGclvy6+0iaOCcC0KHoK2LbNrUJ7nEJBPDXz31j83HU8/PgLPP38a/zzhdO45NzjWPjUVfzbvx7HeT/5TbXDrBlNTSsYOHD7DdMDBvSjqWlFFSOqPbnfxwWV37pJ2TV8SfUUR9b3JbNGArtHxBvJ01nei4h9JPUB/iDpAYp3ePscMBwYALwMXN+m3x2AKcDopK++EbFS0rXA6o08oPfTwJutpt9K5i0td3uypqUlGDX+X9hm6y24tfF77Lbrjpx03IH84KL/4D9nzmXi4aO45qcNHPaNS6odak0oPib0k7J4ZWRPlvt9nMFtLWeEv3nyLMV5wGLgl8n8uRHxRvL6EOCEZL05QD+K5ZbRwC3J47mWAA+30/8oYPb6viJiZRkxtbcn/+5fV+vHhq1b/WoZ3Vbfe+9/yOynXuGQMXty3MTR/OfMuQDcMf0p9t7z
s1WOrnYMHLg9y5a9s2G6qWkF/fv3rWJEtSf3+7hO5bdu0pka/oiIOCMi1ibzP2i1joAzWq03NCIeSJaVerq6ylinrbeAwa2mdwSWtF0pIhojYu+I2Lt+y106+RHdZ/u+W7HN1sVj4Zv16cXY/Xfnv15bwtKmd/nKqC8AMGa/f+DVRcuqGWZNGT58GIsWLeHNN5exdu3HzJgxm7FjR1Y7rJqS+30sld+6SaVOy7wfOFXSwxHxsaRdgbeB2cDJkqZRrN8fANzc5r1PAldLGtq6pAOsArbeyOfdQ/Fp7b8B9qVYTuqx5ZyB/bdjyuWnUldXoFAQd0x/ipmznuO99z/kpxecQH1dHR999DGnnzO12qHWjPr6Os477xQmTz6f5uYWJk48iGHDhlQ7rJqS+32cvYoOaq/O9okVpNURsWWbeWOAsyPi8GS6APwYOILiZv4FOAp4H7gSGAssSN5+U0TcnjyI9+yImCdpPHAJxV8cyyPi4ORL43agheKvh8dafb6Aq4BxwIfAP0XEvI62Y/Odju3srwjrpDWLL6x2CGYVsusmp+th464vO+csvO9/dcvXQ8mEXyuc8NPnhG+1owIJf3wnEv7M7kn4vtLWzCwFUZe9S22zF5GZWS2o0IVXkjaTNDe5TuklSRcm8/tKelDSwuTvdqVCcsI3M0tD5c7S+QgYGxF7AiOAcZJGAecAsyJiGDArme6QE76ZWRoqdKVtckeB1clkr6QFxTsO3JjMv5HiiTIdh9TljTEzs43rREmn9UWiSWv4RFfF29c8DywHHoyIOcCA9aejJ3/7U4IP2pqZpaETF1RFRCPQ2MHyZmCEpG2BuyTt3pWQnPDNzNKQwi0TIuKvyTVM44AmSYMiYmly88jlpd7vko6ZWRoqdNBW0g7JyB5JmwMHAX+ieMeBE5PVTgTuLhWSR/hmZmmo3AB/EHBj8uyRAnBbREyX9CRwm6STKN7YsuTjxJzwzcxSEBW6z31EzKd4q/m281cAB3amLyd8M7M0ZPB++E74ZmZpyF6+d8I3M0tFBu+l44RvZpYGj/DNzHKiGx9OXi4nfDOzNDjhm5nlQ2Qv3zvhm5mlwgdtzcxywiUdM7OcyN4A3wnfzCwVvtLWzCwnXNIxM8uH8AjfzCwn6p3wzczywSN8M7OccA3fzCwnspfvnfDNzNJQqSdeVZITvplZGpzwzcxyos4Jv2rWLL6w2iHUvF1HPlTtEGrek7N3rHYIudBvs103vZMKnaUjaTAwDRgItACNEfFzSRcA3wL+kqx6bkT8rqO+cpPwzcy6VeVKOuuA70fEs5K2Ap6R9GCy7IqI+Fm5HTnhm5mloUIJPyKWAkuT16skvQJ8ukshVSQiMzP7hJDKbpIaJM1r1Rra61PSzsBewJxk1umS5ku6XtJ2pWJywjczS0Odym4R0RgRe7dqjW27k7QlcAdwZkS8D1wDfBYYQfEXwGWlQnJJx8wsDRU8LVNSL4rJ/tcRcSdARDS1Wj4FmF6qHyd8M7M0VCjhSxLwS+CViLi81fxBSX0f4GjgxVJ9OeGbmaWhcgP8/YDjgRckPZ/MOxc4VtIIIIBFwMmlOnLCNzNLQaVurRARj9P+10eH59y3xwnfzCwNvj2ymVlO+NYKZmb5UMjgSe9O+GZmKchgRccJ38wsDU74ZmY5oQxmfCd8M7MUuIZvZpYTcsI3M8uHDFZ0nPDNzNKQwUfaOuGbmaXBI3wzs5xwwjczy4mCb61gZpYPHuGbmeWEE76ZWU444ZuZ5YRPyzQzywmP8M3MciKLZ+lk8G4PZmY9n1R+67gfDZb0iKRXJL0k6bvJ/L6SHpS0MPm7XamYnPDNzFJQqYQPrAO+HxFfAEYBp0naDTgHmBURw4BZyXSHnPDNzFJQqYQfEUsj4tnk9SrgFeDTwATgxmS1G4GjSsXkhG9mloKCym+SGiTNa9Ua2utT0s7AXsAcYEBELIXilwLQv1RMPmhrZpaCQl3560ZEI9DY0TqStgTuAM6M
iPe78kQtJ/wMmj37GS6+eAotLS1MmnQwDQ2Tqh1Sj9e7dx03X3cUvXvXUVdX4P5Zr/GLKU/z3ZNHcuDooUQEK1au4ZyLZrH8nQ+rHW7N+O/jL2GLLfpQVyfq6uq4/pbvVjukblPJ0zIl9aKY7H8dEXcms5skDYqIpZIGActL9VMy4UtqBl5I1n0FODEiuvR/hKQbgOkRcbukqcDlEfHyRtYdA6yNiCfaWfZ54FfAF4EfRsTPuhJPFjU3N3PRRdfyq1/9iAED+nHMMd9j7Nh92WWXnaodWo+2dm0zJ3z7bj5cs476ugK3TDma3z+5mKk3PcfPr5sLwPH/OJzTJu/D+Zf+vsrR1parpp7Cttt9qtphdLtKPdNWxY5+CbwSEZe3WnQPcCJwafL37lJ9lVPDXxMRIyJid2AtcEqbYDrxw+VvImLyxpJ9Ygzw5Y0sWwl8B6iZRL/e/PkLGTJkEIMHD6R3714cdthoZs2aU+2wasKHa9YBUF9foL6+QETwwQcfb1i+xea9iIhqhWc1poJn6ewHHA+MlfR80g6lmOgPlrQQODiZ7lBnSzqPAXsko+/zgaXACEnDkw8bA/QBro6I65JvpiuBscAbwIZNk/QocHZEzJM0DrgEqAPeAU6i+MXSLOmbwBkR8dj690bEcmC5pMM6GX/mNTWtYODA7TdMDxjQj/nzF1QxotpRKIi7pk1ipx234de3v8D8l4q/gM86dV+OOvRzrFr9EcefWnKQZJ0g4MxTpiDBhGNGcdQxo6odUrepVEknIh6nVe5s48DO9FX2WTqS6oHxFMs7ACMpllN2o5ig34uIfYB9gG9JGgocDXwOGA58i3ZG7JJ2AKYAEyNiT2BSRCwCrgWuSH5dPNb2fWXGvOHId2PjrV3potu1N8Ks1E/DvGtpCSZ88zZGH34je+w2gGGf6QvAFdfM4atHTOPe+xZy/KThVY6ytlx742nccOuZXHb1ZO689Qmee+b1aofUbSo4wq+YchL+5pKeB+YBiynWkgDmRsQbyetDgBOS9eYA/YBhwGjglohojoglwMPt9D8KmL2+r4hY2cVt+TsR0RgRe0fE3g0N/6NS3aZq4MDtWbbsnQ3TTU0r6N+/bxUjqj2rVq9l7rNv85UvffK4yL33L+CQsZ+pUlS1aYf+2wDQt9+WjB67O6+8uLjKEXWf+kL5rbt0poY/IiLOiIi1yfwPWq0jimWX9esNjYgHkmWliqIqY53cGD58GIsWLeHNN5exdu3HzJgxm7FjR1Y7rB5vu203Y6stewPQp08dXx65I6//+V2GDN5mwzoHjh7K64v+WqUIa8+aD9fywQf/b8PruU8u4DO7DKxyVN2noCi7dZdKnZZ5P3CqpIcj4mNJuwJvA7OBkyVNo3hRwAHAzW3e+yRwtaShEfGGpL7JKH8VsHWF4usx6uvrOO+8U5g8+Xyam1uYOPEghg0bUu2werz+23+Kn5w/lkKhQKEAMx96jUcf/zNXXvo1hg7ZlpYWWLJslc/QqaCVK1fxL2cVLwRtXtfCwYfuxaj9Pl/lqLpPFm+PrFJnJUhaHRFbtpk3huIB18OT6QLwY+AIiiP2v1C8zPd9/nbQdv2Rx5uS0zIf5W8HbcdTPGhbAJZHxMHJl8btQAttDtpKGkixxLR1snw1sFtEvL/xLVngXxEp23XkQ9UOoeY9OXvHaoeQC/02O3KT0/VhDzxeds6Zccj+3fL1UHKE3zbZJ/MeBR5tNd0CnJu0tk7fSL9jWr2eCcxss3wBsMdG3rsM8L98M8us7izVlMtX2pqZpSCLJR0nfDOzFNQ74ZuZ5YNc0jEzyweXdMzMciKLDxtxwjczS4HP0jEzywkftDUzywnX8M3McsIlHTOznPAI38wsJ3yWjplZTrikY2aWE935YJNyOeGbmaUgg/k+kzGZmfV4lXzilaTrJS2X9GKreRdIelvS80k7tGRMm7hNZmbWjoLKb2W4ARjXzvwrWj1a9nelOnFJx8wsBZUcTUfE
bEk7b2o/HuGbmaWgMyN8SQ2S5rVqDWV+zOmS5icln+1KxrSJ22RmZu2oK0TZLSIaI2LvVq2xjI+4BvgsMAJYClxW6g0u6ZiZpSDt0XRENK1/LWkKML3Ue5zwzcxSkPaFV5IGRcTSZPJo4MWO1gcnfDOzVFTyXjqSbgHGANtLegs4HxgjaQQQwCLg5FL9OOGbmaWgkgk/Io5tZ/YvO9uPE76ZWQp6+V46Zmb54Nsjm5nlhBO+mVlO1Dnhm5nlg0f4ZmY54QegmJnlRC+P8K2WPf34Z6sdQs3bZ//Xqh1CLiyYu+l9uKRjZpYTLumYmeWEz9IxM8sJl3TMzHKiPoNPG3HCNzNLQZ1r+GZm+ZDBAb4TvplZGlzDNzPLCSd8M7OccA3fzCwnfJaOmVlOZLGkk8HvIDOznq9O5bdSJF0vabmkF1vN6yvpQUkLk7/blerHCd/MLAUFRdmtDDcA49rMOweYFRHDgFnJdMcxdXYjzMystEInWikRMRtY2Wb2BODG5PWNwFGl+nEN38wsBd1Qwx8QEUsBImKppP6l3uCEb2aWgl6F8k/LlNQANLSa1RgRjZWOyQnfzCwFnRnhJ8m9swm+SdKgZHQ/CFheMqZOfoCZmZWhoPJbF90DnJi8PhG4u9QbPMI3M0tBJUfTkm4BxgDbS3oLOB+4FLhN0knAYmBSqX6c8M3MUqAKHrSNiGM3sujAzvTjhG9mloIsXmnrhG9mloIsHiB1wjczS4F8t0wzs3zIYEXHCd/MLA2VPGhbKU74ZmYpyGC+d8I3M0tDObc97m5O+GZmKXBJx8wsJzKY753wzczS4IRvZpYTvtLWyjJ79jNcfPEUWlpamDTpYBoaSt4TybqgubmFE7/+U3bovy1XXH1ytcOpCb1713HzdUfRu3cddXUF7p/1Gr+Y8jTfPXkkB44eSkSwYuUazrloFsvf+bDa4aYqg/m+9NW/kpolPS/pRUm/lbRFVz9M0g2SjkleT5W0WwfrjpH05Y0sO07S/KQ9IWnPrsaUNc3NzVx00bVMnXoBM2ZczfTps3n11cXVDqsm/eamR9l56MBqh1FT1q5t5oRv382Rx93GhONu4ytf2ok9dx/A1Jue48jjbmXCN2/jkccXcdrkfaodauoq/EzbysRUxjprImJEROwOrAVOab1QUl1XPjgiJkfEyx2sMgZoN+EDbwBfjYg9gB/R+QcHZNb8+QsZMmQQgwcPpHfvXhx22GhmzZpT7bBqTtOyd/nDYy8zYeKXqh1KzflwzToA6usL1NcXiAg++ODjDcu32LwXEdm77UClSeW37tLZ+/s8BuySjL4fkXQz8IKkOkk/lfR0Muo+GUBFV0l6WdIMYMMzFyU9Kmnv5PU4Sc9K+qOkWZJ2pvjFclby6+IrrYOIiCci4t1k8ilgxy5tfQY1Na1g4MDtN0wPGNCPpqYVVYyoNl3xf+7kjLOOpJDFQmsPVyiIu2/6R568/5/4w9w3mf9S8UFMZ526L7+/9wSOGDeMn183t8pRpq+SDzGvZExlkVQPjAdeSGaNBH4YEbsBJwHvRcQ+wD7AtyQNBY4GPgcMB75FOyN2STsAU4CJEbEnMCkiFgHXAlckvy4e6yC0k4CZ5W5H1rU38lEWT+jtwR77/Yts13crvvAPO1U7lJrU0hJM+OZtjD78RvbYbQDDPtMXgCuumcNXj5jGvfct5PhJw6scZfp66gh/c0nPA/MoPlXll8n8uRHxRvL6EOCEZL05QD9gGDAauCUimiNiCfBwO/2PAmav7ysiVpYbvKQDKCb8/72R5Q2S5kma19h4a7ndVtXAgduzbNk7G6abmlbQv3/fKkZUe+Y/9zqPPfICE752AT/85xuYN3cB550zrdph1ZxVq9cy99m3+cqXPvnFeu/9Czhk7GeqFFX3USdadynnLJ01ETGi9YxkxPlB61nAGRFxf5v1DgVKFetUxjp//yZpD2AqMD4i2q15fPLBwAt6RNFw+PBhLFq0hDffXMaAAf2Y
MWM2l112drXDqimnnXkkp515JADPPL2Qm254mIsuPaHKUdWG7bbdjHXrWli1ei19+tTx5ZE70jjtOYYM3oY/v/keAAeOHsrri/5a3UC7QRarhZU6LfN+4FRJD0fEx5J2Bd4GZgMnS5pGsX5/AHBzm/c+CVwtaWhEvCGpbzLKXwVs3d6HSdoJuBM4PiIWVGgbMqG+vo7zzjuFyZPPp7m5hYkTD2LYsCHVDsusLP23/xQ/OX8shUKBQgFmPvQajz7+Z6689GsMHbItLS2wZNkqzr/099UONXVZTPgqdbRc0uqI2LLNvDHA2RFxeDJdAH4MHEFxxP4X4CjgfeBKYCywPjHfFBG3S3o06WOepPHAJRRLTMsj4uDkS+N2oIXir4cNdXxJU4GJwJ+TWesiYu+ON7VnjPB7svfWvlF6Jdsk++z/WrVDyIUFc7+9yel66Yf3lp1zBm1xRLd8PZRM+LXDCT9tTvjpc8LvHpVI+MvW3FN2zhm4+ZEdfp6kRRSrHs2UNcBtn6+0NTNLQQpD9gMi4p3Sq22cE76ZWQqyeDZ1Fh+sbmbW49V1orU+hTxpDW26C+ABSc+0s6xsHuGbmaWgMyP8T55C3q79ImKJpP7Ag5L+FBGzOxuTR/hmZqmo3KVXyYWrRMRy4C6KdzroNCd8M7MUqBP/ddiP9ClJW61/TfHOBi92JSaXdMzMUlC8PKkiBgB3JXc4qAdujoj7utKRE76ZWSoqc5pORLwOVOSZH074ZmYpUAYr5k74ZmYpqGBJp2Kc8M3MUpG9K6+c8M3MUlDq7JtqcMI3M0uBE76ZWU5IddUO4e844ZuZpcIjfDOzXHBJx8wsN3xapplZLniEb2aWE8rgE1Cc8M3MUiB8lo6ZWU54hG9mlgsu6ZiZ5YYTvplZLvj2yGZmueERvplZLhR8P3wzs7xwwjczy4UsXmmbva8gM7OaoE60Ej1J4yT9l6RXJZ3T1Yg8wjczS0GlzsNX8cb6VwMHA28BT0u6JyJe7mxfTvhmZimo4K0VRgKvRsTrAJJ+A0wAnPA3btfsFdRKkNQQEY3VjqNc2/TetdohdFpP28cL5lY7gs7rafu4csrPOZIagIZWsxpb7bNPA2+2WvYWsG9XInINP9saSq9im8j7OH3exyVERGNE7N2qtf6CbO+LI7ryOU74ZmbZ9hYwuNX0jsCSrnTkhG9mlm1PA8MkDZXUG/g6cE9XOspRDb9HymHds9t5H6fP+3gTRMQ6SacD9wN1wPUR8VJX+lJEl0pBZmbWw7ikY2aWE074ZmY54YRfYZKOlhSSPl/GumdK2mITPut/SrqqnfmS9IvkMuz5kr7Y1c/Iqozs589LelLSR5LO7mr/WZWRfXxc8m94vqQnJO3Z1c8wJ/w0HAs8TvFIeilnAl3+n6QD44FhSWsArknhM6otC/t5JfAd4Gcp9J0FWdjHbwBfjYg9gB/hA8CbxAm/giRtCewHnESr/0kk1Un6maQXkpHKGZK+A/w34BFJjyTrrW71nmMk3ZC8PkLSHEnPSXpI0oASoUwApkXRU8C2kgZVdGOrKCv7OSKWR8TTwMcV38gqy9A+fiIi3k0mn6J4Drp1kU/LrKyjgPsiYoGklZK+GBHPUhxlDwX2Sk6x6hsRKyV9DzggIt4p0e/jwKiICEmTgR8A3+9g/fYuxf40sLSL25U1R5GN/VzLjiJ7+/gkYGbXNsfACb/SjgX+PXn9m2T6WeAg4NqIWAcQESs72e+OwK3JKL03xZ+5HanYpdgZlZX9XMsytY8lHUAx4e/fyc+zVpzwK0RSP2AssLukoHiBREj6AcUEXE7Cbb3OZq1eXwlcHhH3SBoDXFCin4pdip01GdvPNSlr+1jSHsBUYHxErChnG6x9ruFXzjEU6+ZDImLniBhMcfSyP/AAcIqkegBJfZP3rAK2atVHk6QvSCoAR7eavw3wdvL6xDJiuQc4ITlbZxTwXkTUSjknS/u5VmVmH0vaCbgTOD4iFmzKRpkTfiUd
C9zVZt4dwDcojk4WA/Ml/TGZB8UzDmauP9AFnANMBx7mk/X2C4DfSnoMKFUjBfgd8DrwKjAF+HZnNybDMrOfJQ2U9BbwPeBfJb0laesubVW2ZGYfA+cB/YD/K+l5SfM6vzm2nm+tYGaWEx7hm5nlhBO+mVlOOOGbmeWEE76ZWU444ZuZ5YQTvplZTjjhm5nlxP8HySmx/AmPrIoAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import seaborn as sns\n",
    "def compute_confusion_matrix(y_pred, y_test):\n",
    "    # Rows index the predicted class; columns index the actual class.\n",
    "    num_classes = len(set(y_pred) | set(y_test))\n",
    "    confusion_matrix = np.zeros((num_classes, num_classes))\n",
    "    for prediction, actual in zip(y_pred, y_test):\n",
    "        confusion_matrix[prediction][actual] += 1\n",
    "\n",
    "    return confusion_matrix\n",
    "\n",
    "\n",
    "M = compute_confusion_matrix(y_pred, y_test)\n",
    "sns.heatmap(M, annot=True, cmap='YlGnBu',\n",
    "            yticklabels=[f\"Predict {i}\" for i in range(3)],\n",
    "            xticklabels=[f\"Actual {i}\" for i in range(3)])\n",
    "plt.yticks(rotation=0)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each diagonal element `M[i][i]` tracks the number of accurately predicted instances of _Class i_. Such accurate predictions are commonly called **true positives**.\n",
    "\n",
    "**Listing 20. 15. Counting the number of accurate predictions**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our results contain 104 accurate predictions.\n"
     ]
    }
   ],
   "source": [
    "num_accurate_predictions = M.diagonal().sum()\n",
    "print(f\"Our results contain {int(num_accurate_predictions)} accurate \"\n",
    "       \"predictions.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The fraction of predictions that are accurate is referred to as the **accuracy** score. Accuracy can be computed by dividing the diagonal sum by the total sum of the matrix elements.\n",
    "\n",
    "**Listing 20. 16. Measuring the accuracy score**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our predictions are 92% accurate.\n"
     ]
    }
   ],
   "source": [
    "accuracy = M.diagonal().sum() / M.sum()\n",
    "assert accuracy == 104 / (104 + 9)\n",
    "print(f\"Our predictions are {100 * accuracy:.0f}% accurate.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Errors are present in the output. The model occasionally confuses instances of _Classes 1_ and _2_. Let's try to quantify that confusion. First, we'll count the total number of elements that we've predicted as belonging to _Class 1_.\n",
    "\n",
    "**Listing 20. 17. Counting the predicted Class 1 elements**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "We've predicted that 38 elements belong to Class 1.\n"
     ]
    }
   ],
   "source": [
    "row1_sum = M[1].sum()\n",
    "print(f\"We've predicted that {int(row1_sum)} elements belong to Class 1.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We've predicted that 38 elements belong to _Class 1_. Of these, 33 predictions are true positives. The remaining 5 predictions are **false positives**: elements labeled as _Class 1_ that actually belong to another class. The ratio `33 / 38` produces a metric called **precision**. A low precision indicates that a predicted class label is not very reliable.\n",
    "\n",
    "**Listing 20. 18. Computing the precision of Class 1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Precision of Class 1 is 0.87\n"
     ]
    }
   ],
   "source": [
    "precision = M[1][1] / M[1].sum()\n",
    "assert precision == 33 / 38\n",
    "print(f\"Precision of Class 1 is {precision:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Errors can also be detected across the confusion matrix columns. Consider, for example, _Column 1_. That column tracks all elements in `y_test` whose true label is equal to _Class 1_.\n",
    "\n",
    "**Listing 20. 19. Counting the total Class 1 elements**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "37 elements in our test set belong to Class 1.\n"
     ]
    }
   ],
   "source": [
    "col1_sum = M[:,1].sum()\n",
    "assert col1_sum == y_test[y_test == 1].size\n",
    "print(f\"{int(col1_sum)} elements in our test set belong to Class 1.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "37 elements in our test set belong to _Class 1_. Of these, 33 elements are true positives. The remaining 4 elements are **false negatives**: instances of _Class 1_ that the model assigned to another class. The ratio `33 / 37` produces a metric called **recall**. A low recall indicates that our predictor commonly misses valid instances of a class.\n",
    "\n",
    "**Listing 20. 20. Computing the recall of Class 1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Recall of Class 1 is 0.89\n"
     ]
    }
   ],
   "source": [
    "recall = M[1][1] / M[:,1].sum()\n",
    "assert recall == 33 / 37\n",
    "print(f\"Recall of Class 1 is {recall:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A maximum recall of 1.0 is trivial to achieve. We simply need to label each incoming data point as belonging to _Class 1_. Every actual instance of _Class 1_ then counts as a true positive, but every other element becomes a false positive, so precision drops drastically.\n",
    "\n",
    "**Listing 20. 21. Checking precision at a recall of 1.0**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Precision at a trivially maximized recall is 0.33\n"
     ]
    }
   ],
   "source": [
    "# If every element is labeled Class 1, all actual Class 1 elements are true positives\n",
    "low_precision = M[:,1].sum() / M.sum()\n",
    "print(f\"Precision at a trivially maximized recall is {low_precision:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We should combine precision and recall into a single metric. However, we can't average these values because they are fractions with different denominators. Fortunately, their numerators are both equal to `M[1][1]`. Thus, we can invert the fractions and then take their average.\n",
    "\n",
    "**Listing 20. 22. Taking the mean of the inverted metrics**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The average of inverted metrics is 1.14\n"
     ]
    }
   ],
   "source": [
    "inverse_average = (1 / precision + 1 / recall) / 2\n",
    "print(f\"The average of inverted metrics is {inverse_average:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The average of the inverses is higher than 1.0. However, both precision and recall have a ceiling of 1.0. We can force the aggregated value to fall within that range by taking the inverse of the average.\n",
    "\n",
    "**Listing 20. 23. Taking the inverse of the inverted mean**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The inverse of average is 0.88\n"
     ]
    }
   ],
   "source": [
    "result = 1 / inverse_average\n",
    "print(f\"The inverse of average is {result:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our aggregated metric is called the **f1-measure** or **f1-score**. Commonly, it's referred to simply as the **f-measure**. Because it is the inverse of the mean of the inverses, the f-measure is the harmonic mean of precision and recall. It can be computed more directly by running `2 * precision * recall / (precision + recall)`.\n",
    "\n",
    "**Listing 20. 24. Computing the f-measure of Class 1**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The f-measure of Class 1 is 0.88\n"
     ]
    }
   ],
   "source": [
    "f_measure = 2 * precision * recall / (precision + recall)\n",
    "print(f\"The f-measure of Class 1 is {f_measure:.2f}\")"
   ]
  },
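  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The equivalence between the inverse-of-average computation and the direct formula can be checked algebraically. As a brief sketch, write $P$ for precision and $R$ for recall (symbols introduced here for illustration only), and let $TP$, $FP$, and $FN$ denote the counts of true positives, false positives, and false negatives:\n",
    "\n",
    "$$\\frac{1}{\\frac{1}{2}\\left(\\frac{1}{P} + \\frac{1}{R}\\right)} = \\frac{2}{\\frac{P + R}{PR}} = \\frac{2PR}{P + R}$$\n",
    "\n",
    "Substituting $P = TP / (TP + FP)$ and $R = TP / (TP + FN)$ gives\n",
    "\n",
    "$$\\frac{2PR}{P + R} = \\frac{2 \\cdot TP}{2 \\cdot TP + FP + FN}$$\n",
    "\n",
    "For _Class 1_, that is `2 * 33 / (2 * 33 + 5 + 4)`, or `66 / 75 = 0.88`."
   ]
  },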
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this instance, the f-measure is nearly equal to the average of the precision and recall. However, this is not always the case. Consider a prediction that has one true positive, one false positive, and zero false negatives.\n",
    "\n",
    "**Listing 20. 25. Comparing the f-measure to the average**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Precision: 0.5\n",
      "Recall: 1.0\n",
      "Average: 0.75\n",
      "F-measure: 0.67\n"
     ]
    }
   ],
   "source": [
    "tp, fp, fn = 1, 1, 0\n",
    "precision = tp / (tp + fp)\n",
    "recall = tp / (tp + fn)\n",
    "f_measure = 2 * precision * recall / (precision + recall)\n",
    "average = (precision + recall) / 2\n",
    "print(f\"Precision: {precision}\")\n",
    "print(f\"Recall: {recall}\")\n",
    "print(f\"Average: {average}\")\n",
    "print(f\"F-measure: {f_measure:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The f-measure provides us with a robust evaluation for an individual class. With this in mind, we'll now compute the f-measure for each class within our dataset.\n",
    "\n",
    "**Listing 20. 26. Computing the f-measure for each class**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The f-measure for Class 0 is 1.00\n",
      "The f-measure for Class 1 is 0.88\n",
      "The f-measure for Class 2 is 0.88\n"
     ]
    }
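   ],
   "source": [
    "def compute_f_measures(M):\n",
    "    # Rows of M are predictions, so row sums (axis=1) are the precision denominators\n",
    "    precisions = M.diagonal() / M.sum(axis=1)\n",
    "    # Columns of M are actual classes, so column sums (axis=0) are the recall denominators\n",
    "    recalls = M.diagonal() / M.sum(axis=0)\n",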
    "    return 2 * precisions * recalls / (precisions + recalls)\n",
    "    \n",
    "f_measures = compute_f_measures(M)\n",
    "for class_label, f_measure in enumerate(f_measures):\n",
    "    print(f\"The f-measure for Class {class_label} is {f_measure:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We computed three f-measures across three different classes. These f-measures can be combined into a single score by taking their mean.\n",
    "\n",
    "**Listing 20. 27. Computing a unified f-measure**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our unified f-measure equals 0.92\n"
     ]
    }
   ],
   "source": [
    "avg_f = f_measures.mean()\n",
    "print(f\"Our unified f-measure equals {avg_f:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The unified f-measure matches our accuracy to two decimal places. However, the f-measure and accuracy are not guaranteed to be the same. The difference between the metrics is especially noticeable when the classes are **imbalanced**. In an imbalanced dataset, there are far more instances of some _Class A_ than of another _Class B_.\n",
    "\n",
    "**Listing 20. 28. Comparing performance metrics across imbalanced data**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The accuracy for our imbalanced dataset is 0.99\n",
      "The f-measure for our imbalanced dataset is 0.83\n"
     ]
    }
   ],
   "source": [
    "M_imbalanced = np.array([[99, 0], [1, 1]])\n",
    "accuracy_imb = M_imbalanced.diagonal().sum() / M_imbalanced.sum()\n",
    "f_measure_imb = compute_f_measures(M_imbalanced).mean()\n",
    "print(f\"The accuracy for our imbalanced dataset is {accuracy_imb:.2f}\")\n",
    "print(f\"The f-measure for our imbalanced dataset is {f_measure_imb:.2f}\")"
   ]
  },
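  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The gap between the two metrics can also be seen with raw labels rather than a prebuilt matrix. The following is a minimal sketch under assumed data: 100 hypothetical labels, with 95 instances of Class 0 and 5 of Class 1, scored against a trivial baseline that always predicts Class 0. The helper names `accuracy_of` and `macro_f_measure` are invented for this illustration, and a class with no true positives is assigned an f-measure of 0 by assumption.\n",
    "\n",
    "```python\n",
    "def accuracy_of(y_pred, y_true):\n",
    "    # Fraction of predictions that match the true labels.\n",
    "    matches = sum(p == t for p, t in zip(y_pred, y_true))\n",
    "    return matches / len(y_true)\n",
    "\n",
    "def macro_f_measure(y_pred, y_true):\n",
    "    # Mean of the per-class f-measures, each computed as 2*TP / (2*TP + FP + FN).\n",
    "    f_measures = []\n",
    "    for label in sorted(set(y_true) | set(y_pred)):\n",
    "        tp = sum(p == label and t == label for p, t in zip(y_pred, y_true))\n",
    "        fp = sum(p == label and t != label for p, t in zip(y_pred, y_true))\n",
    "        fn = sum(p != label and t == label for p, t in zip(y_pred, y_true))\n",
    "        # Assumption: a class with no true positives scores 0.\n",
    "        f_measures.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)\n",
    "    return sum(f_measures) / len(f_measures)\n",
    "\n",
    "# Hypothetical imbalanced labels: 95 instances of Class 0, 5 of Class 1.\n",
    "y_true_imb = [0] * 95 + [1] * 5\n",
    "# A trivial baseline that always predicts the majority class.\n",
    "y_baseline = [0] * 100\n",
    "\n",
    "baseline_accuracy = accuracy_of(y_baseline, y_true_imb)\n",
    "baseline_f = macro_f_measure(y_baseline, y_true_imb)\n",
    "print(f'Baseline accuracy: {baseline_accuracy:.2f}')\n",
    "print(f'Baseline f-measure: {baseline_f:.2f}')\n",
    "```\n",
    "\n",
    "With these assumed labels, the baseline is correct 95% of the time, yet its unified f-measure falls below 0.5 because Class 1 is never predicted."
   ]
  },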
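  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Scikit-Learn's Prediction Measurement Functions\n",
    "All the prediction metrics that we've discussed thus far are available in Scikit-Learn. They can be imported from the `sklearn.metrics` module. For instance, we can compute the confusion matrix by importing and running `confusion_matrix`. Note that Scikit-Learn's metric functions take the true labels as their first argument and the predictions as their second. By default, `confusion_matrix` assigns rows to true labels, so we pass `y_pred` first to keep predictions along the rows, as in our matrix `M`.\n",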
    "\n",
    "**Listing 20. 29. Computing the confusion matrix using Scikit-Learn**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[38  0  0]\n",
      " [ 0 33  5]\n",
      " [ 0  4 33]]\n"
     ]
    }
   ],
   "source": [
    "from sklearn.metrics import confusion_matrix\n",
    "new_M = confusion_matrix(y_pred, y_test)\n",
    "assert np.array_equal(new_M, M)\n",
    "print(new_M)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In that same manner, we can compute the accuracy by importing and running `accuracy_score`.\n",
    "\n",
    "**Listing 20. 30. Computing the accuracy using Scikit-Learn**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "assert accuracy_score(y_pred, y_test) == accuracy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also, the f-measure can be computed with the `f1_score` function. Passing `average=None` into the function will return a vector of individual f-measures for each class.\n",
    "\n",
    "**Listing 20. 31. Computing all f-measures using Scikit-Learn**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1.   0.88 0.88]\n"
     ]
    }
   ],
   "source": [
    "from sklearn.metrics import f1_score\n",
    "new_f_measures = f1_score(y_pred, y_test, average=None)\n",
    "assert np.array_equal(new_f_measures, f_measures)\n",
    "print(new_f_measures)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Meanwhile, passing `average='macro'` will return a single average score.\n",
    "\n",
    "**Listing 20. 32. Computing a unified f-measure using Scikit-Learn**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_f_measure = f1_score(y_pred, y_test, average='macro')\n",
    "assert new_f_measure == new_f_measures.mean()\n",
    "assert new_f_measure == avg_f"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 20.3 Optimizing KNN Performance\n",
    "\n",
    "Currently, our `predict` function takes two input parameters: `K` and `weighted_voting`. These parameters must be set prior to training, and they influence the classifier's performance. Data scientists refer to such parameters as **hyperparameters**. Let's try to optimize our classifier's hyperparameters by iterating over all possible combinations of `K` and `weighted_voting`.\n",
    "\n",
    "**Listing 20. 33. Optimizing KNN hyperparameters**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAx70lEQVR4nO3deXxU9dX48c9JCIQkE5YkmLCHXUAEjCAEEFQQcLdajb+nitYiVqti9dG2v2ptH5/SYlurVRGqdf0VsbgXNxBFCFaCIvsSECGAkAVIYliynN8fdxJD1gnMzcxkzvv1mlfm3rn3zsklzJnvLqqKMcaY8BUR6ACMMcYEliUCY4wJc5YIjDEmzFkiMMaYMGeJwBhjwlyrQAfQVImJidqzZ89AhxEStmzZAkD//v0DHIkxJtBWr16dp6pJdb0WcomgZ8+eZGVlBTqMkDB+/HgAPv7444DGYYwJPBH5pr7XrGrIGGPCnCUCY4wJc5YIjDEmzFkiMMaYMGeJwBhjwpwlAmOMCXOWCIwxJsyF3DgC45v8A3soKviW8uNH+efCV9kfO5CKiLr/uU9PiefCQclEREgzR+kor1DW7znMqp0FJHnaMKp3Ap080QGJxZhwZImghTh0MJ/tqz7gePZSOuV9Tu+Kr/EUfgdAxrqbKdK2fF4xgJU6kMyKQWzW7igRVC5HMSDZw10X9OPCQach4m5CUFW27i8mc3semdvz+WxHPkVHy044pt9pcYzuncio3gmck5pAu5goV2MyJpxJqC1Mk5aWpoEcWfz1xlUc3re9md9VQARFQCIAQYHy48c49vVKEnI/o2/ZNlpJBcc0iuzoQRR3Hs3PHltE27axfPL0vfD1MueRn+1csm1HSB1LRc9xLD3Wn//5rIyv80sY3CWen0/sz/j+SX5LCKrKN/klrNyRT+b2fFZuzyOv+DgA3TvGMLp3AqP7JDIytSPfHj5K5vZ8MrfnsWpnAUdLK4gQGNylHaN6JzC6dyJn9+xATGv7DtPccg6WsCu/JNBhhLXO7dvSMzH2pM4VkdWqmlbna5YIfFNRXs5/nv8FI7+ZS4QEzz0r0wh2tO7PoeTRxA88n17DJtA6OgaoZ4qJw3tg56dOUtjxCRTmAKBxyeyOH878vB78u6gvHbv25+eTBpDeJ+GkEsLuAueD/7Pt+azckc++w0cB6ORpQ3of55v+qF4JdOsYU+81jpWVs2bXIW/yyOfL3QcpLVeiIoVh3Towuo+TGIZ2a0/rVtbc5W95xcecpJztlNx2FVgSCLQZ5/bm/ikDTupcSwSn6GDuPnY/818MOZrFqnaTaHfu7a5XnzgUp+5GnTKA93nlzwiJoHP/NKLj2td5dqNzDalCwQ5vYvjU+Vm8H4BvSWRF+QD2dxzBqPMvp2//QQ1GevhIKf/Z4Xxgr9yRT87BIwAkxLbmnF4JnNM7gVG9OtI7Ke6k713J8TJW7TzoVCll57N+72FUIaZ1JGf37Mjo3gmk90nk9JR4IgPU3hHKCo+W8p8dBVX3d8v+IgA80a04p1cCo3snMCA5Hru1gdO5fdsGvzw1xBLBKdi8ajEd/j2d9lrIV2f8krOvvAuJCI1vn02edE4V8rbBzmWU71jG8exltC09CMAn5UP4Y9m1bNCeDV6ifUwUI1M7MqpXAqN6J9KvUyzy9cew9PdwvBjGzITBP4CIyJP+vSodLin1Vjc531izDxSf8jUrtYoQhnRtR3qfREb3TmR4j/a0aXXqMTe3o6XlvLjyG55etr2qOq4xbVpFOIm1TwLpvRMZ1DmeVpGh8Tdv6meJ4CRoRQX/mf8wZ235C7kRiZRc/ix9zhzj+vv60ynPPqrK0T3r2fbpAvpuf57ossNkd7qQ1b1+SmFM9xMOjY6KYHiPDpyeHP9976Pdq2DJQ05Jo103iG4H+9dD0gAY/ws4/VLwY1LdX3iUzO15fJ136lUYR0vLWbWzgLU5hymvUKKjvB+OvRNJ
75PAoM7tgrrUUVpewatZOTy2ZBvfFh5lbN9EhnXv0OA5bVpFMLx7h5BNeqZhlgiaqPBQPtnzpjH8u2V8GTOaXj95kXYdEl19Tzf4dRrqo4dhxWPw2ZNQfhyGXw/n3gee5NrH7t8IH/0OtiyCmEQYdy+k3QgRUbDpTVj6v5C3FZLPgAm/gn6ToVmq2pquyFtdsqJGdUm7tlGc06sjY/okMrpPIr0SY32u8qqoULbsL2JFdh5rcw4zvn8Slw3t4pfEUlGhvL12L3/5cCs780sY1r09/33hAEb1Tjjla5vQZomgCbav+4w2r00juWI/WX3vYOR1D4ZMVVBNrqxHULQfls2G1f9wPtjPuRXS74S27aHga/j497B2AbTxQPodMPJWaBN34jUqymHdv5xjD34NXc5yEkLv84I2IVTKLTpWVYe+PDuPPYectpCUdtGM7p3ImL5OdUqn+BPHQewuKGFFdh4rvI2v+d851TQdYqI4WFJK/9M83HNhfy44vdNJtaGoKku3HGD2+1vZtK+QAcke7pnUn/NP8nqm5bFEAGzOWsJ3n/yt4YO0gkFFKyiSOHInz2HgOZNPMsrg4OrCNAU7nG/2616F6PbQazxsfsdJDiNvcZJDTMeGr1FeCl/9Ez75IxzeDd1Hw1nToNe5dZc0goyqsqughOXZTmJYsT2PQyWlAPTtFEd6n0SOlVWwIjuvqsdNkqcNY/okkt7HqWI6zRPNovX7+NMHW/k67zuGd2/PfZMHMLKXb9/gC4+Wkpmdz98/3UHWNwfpkRDD3RP7ccmQzgEbIGiCkyUC4Kulr9Jx2QONHpcX04tuP5pDYnK3kwkvqDTLCmX71jrVQDs+hmE/cqqB4lOado2yY/DFC/Dpn6Bon7MvaQCknuskhZ5jnPaFIFdRoWzcV8iK7DyWZzvjIFpFRHBOrwTS+yQwpk8ifTrV3WuqtLyCf63O4dHFW9lfeIzx/ZO498L+DOp84u99vKyCL3cdZEV2Hp96q5bKK5ROnjbccX5frjm7G1HWsGvqYIkgTIXcUpUVFfDtWvj6EyexfLMSyo44g+g6D3eSQu/zocfooK9CAufDXaBJPW6OlpbzfOZOnvx4O4ePlHLJmZ25bkR3Nuw9zPLsPP6zo4AjpeVECJzZrT1jvaWLYd072FgK06CAJQIRmQz8FYgE/q6qs2q83gF4FugNHAVuUtX1DV3TEoHvQi4R1FR2DHJWOQPfdnwMe1aDlsPQ/4KL/gRRLXc+osNHSpm3bAfPLP+aI6XlAPROiq2qVjqndwLx0TbthvFdQBKBiEQCW4GJQA6wCshQ1Y3VjpkNFKvqQyIyAHhCVc9v6LqWCHwX8omgpqOFkPk4LPuj08D8wxehXZdAR+WqA0VHWb3zIGd2a0/n9m0DHY4JYQ0lAjfLkiOAbFXdoarHgfnAZTWOGQgsAVDVzUBPETnNxZhMKIuOh/N+Bde8DLlbYO542PVZoKNyVSdPNFPOSLEkYFzlZiLoAuyutp3j3VfdV8CVACIyAugBdK15IRGZLiJZIpKVm5vrUrgmZJx+Mdy8xOmW+tzFkPVsoCMyJqS5mQjqas2rWQ81C+ggImuAnwFfAmW1TlKdq6ppqpqWlJTk90BNCOo0AH6y1Om2+s5MePtOp02hMcUHYONbcGh348eGk2PFsHkRhFjnEeMfbs7lmwNU74PZFdhb/QBVLQRuBBCnT93X3ocxjWvbHq57BT76H1j+Z2dE8zUvnjgGoXAffLMCdi53HvnbnP1dz4YffxgSvY9cd2g3/DMD9q+DWz6FlCGBjsg0MzcTwSqgr4ikAnuAa4Hrqh8gIu2BEm8bws3AMm9yMMY3EZFwwYPOh9cbP4Wnz4Vx98C365wP/gLv2hFt4qH7KBj+I6fR+dNHIHsx9J0Y2PgDbfcqmH8dlOQ724V7LBGEIdcSgaqWicjtwPs43UefVdUNIjLD+/oc4HTgBREpBzYCP3YrHtPCDboCEvo6H2qL7oE27ZzxBmk3
OgPSkod8P+Np2XFYtwCWPgx9LgjfUsG6fznJMz4FrnwaXrwCCvc2fp5pcVxd5klVFwGLauybU+35SqCvmzGYMJI8GH66Eg7tgsR+9U913aq1M2Hem7fBlndhwNTmjTPQKiqceZ6W/RF6pDvdcKPbOQP3ir4NdHQmAGy9P9OytI6FTqc3ftyQa2HZI858Sf0m+3U6bL+ovmjQ/g2gFQ0f70l2Rl2nDG34dzleAm/cChvfcAbmXfwXJzECxHb6fooPE1YsEZjwFNkKxt8Pr9/iTJY38NLAxqPqzMT69affN2wXeatp2sRDZAOjiFXhSIHTaB6b5CSEPhc4s7nGVpu8rnCv0yi87yuY9D8w6vYTq8U8yZYIwpQlAhO+zrjaKRV8/HsYcHHzlwpUYcNrsPUD54Pfu340sZ2cdo3UsdBzLCT0abwdozgXtn8E2R/Ctg9g7XxAoMtw6DPRKSW9dz8cK4KM+dC/jpl14zs71Wom7FgiMOErItIpFSz8MWx83VlCszkt/4uzgltMoveDf6bzwZ/Yr+kN2HFJcOY1zqOiHPaucXpFZX8In/wBUGeVuJved9pS6uJJbvEjtU3dLBGY8DboCm+pYBYMvNwvayn7ZP1CJwkMvgqunOff0khEJHQ9y3mMvw9KCiAnC7qmNbxGhCfFqWIqOwat2vgvHhP0gqyFzJhmVlkqyNvqdKdsDrs+g9dvdRbiufxJ96ukYjpCv0mNLxTk8a4jYT2Hwo4lAmNOvxROOwM+mQXltWY48a/87U6DbbuucO3LwfXNuyoRWINxuLFEYExEBEz4hdNdc+0r7r1PSQG8fLVT//9/Xm38G3pzi7dEEK4sERgD0H+q0wf/kz84ayn7W+lRZ9Tz4Ry49p+Q0Nv/73GqKksEhZYIwo0lAmPA+ZY+4Vdw6BtY87J/r11R4Yxi3rUSrpgD3Uf69/r+0rYDRLaxEkEYsl5DxlTqOxG6pDm9iM7MqLv+/tAu2Po+bH3P+Xbfazz0u9CZqqG++v6lD8P6f8H5D8LgK139FU6JiHdQmTUWhxtLBMZUEoEJv4SXroQvXoARP3H65OdkOR/8W9+HAxucYzv2gvY9IOsf8J850DrOGcnbb7KTUOI6Ocd98aIz0+nw62HMzMD9br7ypFiJIAxZIjCmut7nQbdznFJBTpYzSvdIAUikM5vppIedD/vEPs7xx7+Dr5d9nyg2vYUzovcs6DYSPn/aueZFfw6NWU7jU5wpvE1YsURgTHUicN7/hecvhm3vQ99JTtVP7/OdhXBqah0L/ac4D1X4du33VUefPQGdBsHVzzU8V1Aw8aTAtg8DHYVpZpYIjKkpdSzcvQniTmvaSGMRSDnTeZz73/BdHkS1dZJFqPAkw/FiZ/Ge6PhAR2OaifUaMqYu8Z1PfbqJ2MTQSgIAns7OT2swDiuuJgIRmSwiW0QkW0Tur+P1diLytoh8JSIbRORGN+MxxjSicr3nIlupLJy4lghEJBJ4ApgCDAQyRGRgjcNuAzaq6pnAeOBPItLarZiMMY2ItxJBOHKzRDACyFbVHd7F6ecDl9U4RgGPiAgQBxQALk/2YoypV9xpzk/rQhpW3EwEXYDd1bZzvPuq+xvOAvZ7gXXAnaq11+QTkekikiUiWbm5uW7Fa4xpE+esiGbTTIQVNxNBXZ2mtcb2hcAaoDMwFPibiNTqqqCqc1U1TVXTkpKS/B2nMaY6G1QWdtxMBDlAt2rbXXG++Vd3I/CaOrKBr4EBLsZkjGmMrV0cdtxMBKuAviKS6m0AvhZ4q8Yxu4DzAUTkNKA/sMPFmIwxjYnvbI3FYca1AWWqWiYitwPvA5HAs6q6QURmeF+fA/wOeE5E1uFUJd2nqnluxWSM8UHlxHMVFe6vnmaCgqsji1V1EbCoxr451Z7vBSa5GYMxpok8KVBRCiX5EGdtcuHA0r0x5kS2ZGXYsURgjDmRJYKwY4nAGHMiW7s47FgiMMacqGp0sfUcCheWCIwxJ4qMgtgkKLSJ
58KFJQJjTG2eFCsRhBFLBMaY2jwpNhV1GLFEYIypLd5KBOHEEoExpjZPCnyXC+WlgY7ENANLBMaY2qpWKrNSQTiwRGCMqc3WLg4rlgiMMbXZ2sVhxRKBMaY2W7s4rFgiMMbU1rYjRETZNBNhwhKBMaa2iAinesjWLg4LlgiMMXWztYvDhquJQEQmi8gWEckWkfvreP1eEVnjfawXkXIR6ehmTMYYH9naxWHDtUQgIpHAE8AUYCCQISIDqx+jqrNVdaiqDgV+AXyiqgVuxWSMaQJbuzhsuFkiGAFkq+oOVT0OzAcua+D4DOCfLsZjjGkKTzIcK4RjxYGOxLjMzUTQBdhdbTvHu68WEYkBJgML63l9uohkiUhWbm6u3wM1xtShaqUyKxW0dG4mAqljn9Zz7CXAivqqhVR1rqqmqWpaUpItpm1Ms7AlK8OGm4kgB+hWbbsrUN8wxWuxaiFjgoslgrDhZiJYBfQVkVQRaY3zYf9WzYNEpB1wLvCmi7EYY5rK1i4OG63curCqlonI7cD7QCTwrKpuEJEZ3tfneA+9AvhAVb9zKxZjzElo44HWcdZGEAZcSwQAqroIWFRj35wa288Bz7kZhzHmJHmSbe3iMGAji40x9bO1i8OCJQJjTP1s7eKwYInAGFO/yrWLtb6e36YlsERgjKmfJwXKj8ORg4GOxLjIEoExpn6VK5VZg3GL5nMiEJEeInKB93lbEfG4F5YxJijY2sVhwadEICI/Af4FPO3d1RV4w6WYjDHBomrtYhtU1pL5WiK4DUgHCgFUdRvQya2gjDFBwhJBWPA1ERzzTiUNgIi0ov4J5IwxLUWrNhCTYImghfM1EXwiIr8E2orIROBV4G33wjLGBA1Piq1d3ML5mgjuA3KBdcAtONNG/F+3gjLGBBFbu7jFa3SuIRGJANaq6mBgnvshGWOCiicZvl0X6CiMixotEahqBfCViHRvhniMMcEmvjN8dwDKywIdiXGJr7OPpgAbRORzoGq6aFW91JWojDHBw5MMWuEkg/jOgY7GuMDXRPCQq1EYY4JX5UplhfssEbRQPiUCVf3E7UCMMUHKlqxs8XwdWVwkIoXex1ERKReRQh/OmywiW0QkW0Tur+eY8SKyRkQ2iIglHGOCjSWCFs/XEsEJ8wqJyOXAiIbOEZFI4AlgIs5C9qtE5C1V3VjtmPbAk8BkVd0lIjZa2ZhgE5sEEmmJoAU7qdlHVfUN4LxGDhsBZKvqDu+o5PnAZTWOuQ54TVV3ea974GTiMca4KCLCaTC2iedaLJ9KBCJyZbXNCCCNxqeY6ALsrradA4yscUw/IEpEPgY8wF9V9YU63n86MB2ge3frxWpMs7O1i1s0X3sNXVLteRmwk9rf7muSOvbVTB6tgLOA84G2wEoR+UxVt55wkupcYC5AWlqazXFkTHPzpED+9kBHYVziaxvBjSdx7RygW7XtrkDNrxQ5QJ6qfgd8JyLLgDOBrRhjgocnBXYuD3QUxiW+9hr6o4jEi0iUiCwRkTwR+a9GTlsF9BWRVBFpDVwLvFXjmDeBsSLSSkRicKqONjX1lzDGuCw+BY4egtIjgY7EuMDXxuJJqloIXIzzLb4fcG9DJ6hqGXA78D7Oh/sCVd0gIjNEZIb3mE3Ae8Ba4HPg76q6/qR+E2OMe6wLaYvmaxtBlPfnVOCfqlogUlcTwIlUdRHOTKXV982psT0bmO1jHMaYQKhau3gfdOwV2FiM3/maCN4Wkc3AEeCnIpIEHHUvLGNMUKlau9hKBC2RT1VDqno/MApIU9VSnInnGus1ZIxpKaqWrLSxBC2RryUCcMYFTBSR6Gr7avX5N8a0QNHtICrGSgQtlK8Dyh4ExgMDcer8pwDLsURgTHgQ8Y4utkTQEvnaa+gqnEFf33rHFJwJtHEtKmNM8LG1i1ssXxPBEe9KZWUiEg8cAKzrgDHhJL4L5G2F4981fqwJKb4mgizvTKHzgNXAFzj9/o0x4SLtRijJg6X/G+hIjJ/5
OsXET71P54jIe0C8qq51LyxjTNDpMRrSboLPnoRBV0LXswIdkfETX6eYEBH5LxF5QFV3AodEpMH1CIwxLdAFD0FcMrx1O5QdD3Q0xk98rRp6EmccQYZ3uwhn0RljTDiJjoeL/wwHNsKKRwMdjfETXxPBSFW9De9oYlU9CLR2LSpjTPDqPwUG/wCWzYYDmwMdjfEDXxNBqXfpSQXwTjFR4VpUxpjgNvkP0DoW3voZVJQHOhpzinxNBI8BrwOdRORhnMFk1nXAmHAVl+Qkg5zPYdXfAx2NOUW+9hp6WURW4wwqE+By7xTSxphwNeSHsO5VWPyQU13U3paRDVVNWbx+P/ApkAm0FZHh7oRkjAkJIk7DMcA7M0FtFdlQ5etcQ78DpgHb+X7dYQXOcycsY0xIaN8dLngQ3v1vWLsAzrwm0BGZk+BrieCHQG9VHa+qE7yPRpOAiEwWkS0iki0i99fx+ngROSwia7yPB5r6CxhjAuzsm6HbSHjvPijODXQ05iT4mgjWA+2bcmFvL6MncGYqHQhkiMjAOg79VFWHeh+/bcp7GGOCQEQkXPq4MwfRe/cFOhpzEnxNBL8HvhSR90XkrcpHI+eMALJVdYeqHgfmY4vZGNMyJfWHcffC+oWw9YNAR2OayNeFaZ4H/gCsw/fxA12A3dW2c4CRdRw3SkS+AvYC96jqhpoHiMh0YDpA9+7WM8GYoJR+F/xnDmx6E/pNCnQ0pgl8TQR5qvpYE69d1+r2NbsVfAH0UNViEZkKvAH0rXWS6lxgLkBaWpp1TTAmGLVq7TQe23KWIcfXqqHVIvJ7ERklIsMrH42ckwN0q7bdFedbfxVVLVTVYu/zRUCUiCT6GrwxJsjY4jUhydcSwTDvz3Oq7Wus++gqoK+IpAJ7gGuB66ofICLJwH5VVe9sphFAvo8xGWOCjScFdn0W6ChME/k6snhCUy+sqmUicjvwPhAJPKuqG0Rkhvf1OThLYN4qImXAEeBaVRuVYkzI8qTAkQIoPQpR0YGOxvjI1xJBFRF5R1Uv9uVYb3XPohr75lR7/jfgb02NwRgTpOJTnJ/F30KHngENxfiuKVNMVOri9yiMMS2DJ9n5ae0EIeVkEsGXfo/CGNMyeDo7P4ssEYSSBhOBiNTqtK+qN7kXjjEmpFVWDVkiCCmNlQjeqHwiIgvdDcUYE/Ki20OraCjc2+ihJng0lgiqDwrr5WYgxpgWQMTpOWSDykJKY4lA63lujDF186RY1VCIaaz76JkiUohTMmjrfY53W1U13tXojDGhJz4F9lqfklDSYCJQ1cjmCsQY00J4UqBwkbNimdQ15ZgJNifTfdQYY+rnSYGyI3D0cKAjMT6yRGCM8S/rQhpyLBEYY/zL400E1oU0ZFgiMMb4V2UisC6kIcMSgTHGvyrnGyqyEkGosERgjPGvqLbQtoOVCEKIJQJjjP/ZSmUhxRKBMcb/bHRxSHE1EYjIZBHZIiLZInJ/A8edLSLlInKVm/EYY5pJvCWCUOJaIhCRSOAJYAowEMgQkYH1HPcHnCUtjTEtgScFivdDRXmgIzE+cLNEMALIVtUdqnocmA9cVsdxPwMWAgdcjMUY05w8KaAVUGz/rUOBm4mgC7C72nYONZa5FJEuwBXAHBogItNFJEtEsnJzc/0eqDHGz6rGElgX0lDgZiKoa7apmlNZPwrcp6oNlh9Vda6qpqlqWlJSkr/iM8a4Jd4GlYWSxqahPhU5QLdq212Bml8P0oD54sxQmAhMFZEyVX3DxbiMMW6zaSZCipuJYBXQV0RSgT3AtcB11Q9Q1dTK5yLyHPCOJQFjWoDYJJBIKxGECNcSgaqWicjtOL2BIoFnVXWDiMzwvt5gu4AxJoRFRELcadaFNES4WSJAVRcBi2rsqzMBqOo0N2MxxjQzG0sQMmxksTHGHTbNRMiwRGCMcYcnxbqPhghLBMYYd8SnOMtVHi8JdCSmEZYIjDHu8NiSlaHCEoExxh22UlnIsERgjHGHlQhC
hiUCY4w74i0RhApLBMYYd7SJh6hY60IaAiwRGGPcIeIsZG9dSIOeJQJjjHviO1tjcQiwRGCMcY8n2WYgDQGWCIwx7vGkOCUCrbkUiQkmlgiMMe7xpED5MThyMNCRmAZYIjDGuMe6kIYESwTGGPd4Ojs/rQtpULNEYIxxjyfZ+WldSIOaq4lARCaLyBYRyRaR++t4/TIRWSsia0QkS0TGuBmPMaaZVSUC60IazFxboUxEIoEngIk4C9mvEpG3VHVjtcOWAG+pqorIEGABMMCtmIwxzaxVG4hJsC6kQc7NEsEIIFtVd6jqcWA+cFn1A1S1WLWqX1ksYH3MjGlpPDaoLNi5mQi6ALurbed4951ARK4Qkc3Av4Gb6rqQiEz3Vh1l5ebmuhKsMcYlNs1E0HMzEUgd+2p941fV11V1AHA58Lu6LqSqc1U1TVXTkpKS/BulMcZd8SlWIghybiaCHKBbte2uQL1fC1R1GdBbRBJdjMkY09w8naH4AJSXBjoSUw83E8EqoK+IpIpIa+Ba4K3qB4hIHxER7/PhQGsg38WYjDHNzZMMKBTvD3Qkph6u9RpS1TIRuR14H4gEnlXVDSIyw/v6HOAHwPUiUgocAa6p1nhsjGkJ4r2Dyoq+hXZdAxuLqZNriQBAVRcBi2rsm1Pt+R+AP7gZgzEmwCrHElgX0qDlaiIw4aO0tJScnByOHj0a6FBMAEVHR9O1a1eioqK+3+mpViIwQckSgfGLnJwcPB4PPXv2xNvsY8KMqpKfn09OTg6pqanfvxCTABFR1oU0iNlcQ8Yvjh49SkJCgiWBMCYiJCQk1C4VRkR4xxJYiSBYWSIwfmNJwNT7N+BJsTaCIGaJwBjjPk+yrUkQxCwRmBZh5syZPProo1XbF154ITfffHPV9s9//nP+/Oc/13v+Aw88wOLFixt8j9/85jc88sgjtfYfOnSIJ598sskx13W9jz/+mFGjRp2wr6ysjNNOO419++r+IP3444/JzMys2p4zZw4vvPBCk+NxlS1iH9QsEZgWYfTo0VUfhhUVFeTl5bFhw4aq1zMzM0lPT6/3/N/+9rdccMEFJ/XeJ5sI6jJu3DhycnLYuXNn1b7FixczePBgUlJS6jynZiKYMWMG119/vV/i8RtPMhwrhGPFgY7E1MF6DRm/e+jtDWzcW+jXaw7sHM+Dlwyq9/X09HRmzpwJwIYNGxg8eDD79u3j4MGDxMTEsGnTJoYNG8bq1au5++67KS4uJjExkeeee46UlBSmTZvGxRdfzFVXXcWiRYu4++67SUxMZPjw4ezYsYN33nkHgI0bNzJ+/Hh27drFXXfdxR133MH999/P9u3bGTp0KBMnTmT27NnMnj2bBQsWcOzYMa644goeeughAB5++GFeeOEFunXrRlJSEmedddYJv0dERARXX301r7zyCvfddx8A8+fPJyMjg4KCAm666SZ27NhBTEwMc+fOJT4+njlz5hAZGclLL73E448/zpIlS4iLi+Oee+5h/PjxjBw5kqVLl3Lo0CGeeeYZxo4dS0lJCdOmTWPz5s2cfvrp7Ny5kyeeeIK0tDS//rtVqd6FtE0fd97DnDRLBKZF6Ny5M61atWLXrl1kZmYyatQo9uzZw8qVK2nXrh1DhgxBRPjZz37Gm2++SVJSEq+88gq/+tWvePbZZ6uuc/ToUW655RaWLVtGamoqGRkZJ7zP5s2bWbp0KUVFRfTv359bb72VWbNmsX79etasWQPABx98wLZt2/j8889RVS699FKWLVtGbGws8+fP58svv6SsrIzhw4fXSgQAGRkZTJ8+nfvuu49jx46xaNEi/vKXv/DAAw8wbNgw3njjDT766COuv/561qxZw4wZM6o++AGWLFlywvXKysr4/PPPWbRoEQ899BCLFy/mySefpEOHDqxdu5b169czdOhQ//6D1FR9pbJESwTBxhKB8buGvrm7KT09nczMTDIzM7n77rvZs2cPmZmZtGvXjtGjR7NlyxbWr1/P
xIkTASgvL69V3bJ582Z69epV1Q8+IyODuXPnVr1+0UUX0aZNG9q0aUOnTp3Yv7/2/DkffPABH3zwAcOGDQOguLiYbdu2UVRUxBVXXEFMTAwAl156aZ2/x9lnn01xcTFbtmxh06ZNnHPOOXTo0IHly5ezcOFCAM477zzy8/M5fPhwo/flyiuvBOCss86qqnJavnw5d955JwCDBw9myJAhjV7nlMTboLJgZonAtBiV7QTr1q1j8ODBdOvWjT/96U/Ex8dz0003oaoMGjSIlStX1nuNxqa6atOmTdXzyMhIysrK6rzGL37xC2655ZYT9j/66KM+d7G99tprmT9/Pps2baoqldQVmy/Xq4y5erzNPqWXTTMR1Kyx2LQY6enpvPPOO3Ts2JHIyEg6duzIoUOHWLlyJaNGjaJ///7k5uZWJYLS0tITGpQBBgwYwI4dO6q+Ob/yyiuNvq/H46GoqKhq+8ILL+TZZ5+luNhpGN2zZw8HDhxg3LhxvP766xw5coSioiLefvvteq+ZkZHBSy+9xEcffVRVchg3bhwvv/wy4DQQJyYmEh8fX+v9fTFmzBgWLFgAOO0e69ata9L5TdbGA6091oU0SFmJwLQYZ5xxBnl5eVx33XUn7KtsGAb417/+xR133MHhw4cpKyvjrrvuYtCg76uy2rZty5NPPsnkyZNJTExkxIgRjb5vQkIC6enpDB48mClTpjB79mw2bdpU1Q00Li6Ol156ieHDh3PNNdcwdOhQevTowdixY+u95sCBA4mJieGss84iNjYWcLqb3njjjQwZMoSYmBief/55AC655BKuuuoq3nzzTR5//HGf7tVPf/pTbrjhBoYMGcKwYcMYMmQI7dq18+nckxafYokgSEmozfqclpamWVlZgQ4jJIwfPx5wvj26bdOmTZx++umuv09zKC4uJi4uDlXltttuo2/fvlU9klqK8vJySktLiY6OZvv27Zx//vls3bqV1q1bn/K16/1beP4SKD0KN394yu9hmk5EVqtqnd3CrERgTA3z5s3j+eef5/jx4wwbNqxWXX9LUFJSwoQJEygtLUVVeeqpp/ySBBrk6QzfZDZ+nGl2lgiMqWHmzJktrgRQk8fjodlL1pXTTFRUOBPRmaDh6r+GiEwWkS0iki0i99fx+v8RkbXeR6aInOlmPMaYAIrvDBWlcKQg0JGYGlxLBCISCTwBTAEGAhkiMrDGYV8D56rqEOB3wFyMMS2TdSENWm6WCEYA2aq6Q1WPA/OBy6ofoKqZqnrQu/kZYAuaGtNS2UplQcvNRNAF2F1tO8e7rz4/Bt6t6wURmS4iWSKSlZub68cQjTHNJt47ittWKgs6biaCuoY81tlXVUQm4CSC++p6XVXnqmqaqqYlJSX5MUTTUuzcuZPBgwefsK++aaP9ZfTo0Y0e07NnT/Ly8mrtrzljqK/qut60adN4+umnT9j3xhtvMHXq1Hqv8+ijj1JSUlK1PXXqVA4dOtTkeJok7jRAoNDGEgQbNxNBDtCt2nZXoNZXAREZAvwduExV812Mxxi/OpkP8konmwjqkpGRwfz580/YVzljaX1qJoJFixbRvn17v8RTr8goiE2yQWVByM3uo6uAviKSCuwBrgWuq36AiHQHXgN+pKpbXYzFNKd374dv/TxlQfIZMGXWSZ9e33TMU6dOZdasWVUjbK+44goeeOABfv3rX9OjRw9uvvnmeqeUjouLo7i4mIqKCm6//XY++eQTUlNTqaio4KabbuKqq64C4PHHH+ftt9+mtLSUV199lejo6FpTRw8YMIAZM2awa9cuwPmgTk9PJz8/n4yMDHJzcxkxYkSdcwRdcMEFTJs2jX379pGSkkJJSQmLFy9m3rx5LFmyhHvuuYeysjLOPvtsnnrqKZ5++mn27t3LhAkTSExMZOnSpfTs2ZOsrCyKi4uZMmUKY8aMITMzky5duvDmm2/Stm1bVq1axY9//GNiY2MZM2YM7777LuvXr2/aP4StVBaUXCsR
qGoZcDvwPrAJWKCqG0RkhojM8B72AJAAPCkia0TEhgwb11ROx/zoo49WfZiPGzeOTz/9lMLCQlq1asWKFSsAZ3bOsWPHnjCl9Jo1a1i9ejXLli074bqvvfYaO3fuZN26dfz973+vNaldYmIiX3zxBbfeeiuPPPIIPXv2ZMaMGcycOZM1a9YwduxY7rzzTmbOnMmqVatYuHBh1epqDz30EGPGjOHLL7/k0ksvrUoU1UVGRnLllVdWzR301ltvMWHCBKKiopg2bRqvvPIK69ato6ysjKeeeoo77riDzp07s3TpUpYuXVrretu2beO2225jw4YNtG/fvmrG0xtvvJE5c+awcuVKIiMjT+4fIb6zJYIg5OqAMlVdBCyqsW9Otec3AzfXPM+EuFP45n6y6puFs/r+uqZjHjt2LI899hipqalcdNFFfPjhh5SUlLBz50769+/PvHnz6pxSety4cVXXXb58OVdffTUREREkJyczYcKEE2Ko/r6vvfZanXEuXryYjRs3Vm0XFhZSVFTEsmXLqs656KKL6NChQ53nZ2RkcO+993LnnXcyf/58rr/+erZs2UJqair9+vUD4IYbbuCJJ57grrvuqvMalVJTU6vWJ6i8V4cOHaKoqKiqXeS6666rWqynSTzJkGPf94KNjSw2LUJCQgIHDx48YV9BQUHVugJQ93TMZ599NllZWfTq1YuJEyeSl5fHvHnzqhaMqW9K6ep8nbq6vmmrwVlec+XKlbRt27bWa75MNZ2ens6+ffv46quvyMzMZP78+WzevLnR8xqKtzLmI0eO+G/aak9nKMmDsuPQyuUpLYzPbJy3aRHi4uJISUmpWp2roKCA9957jzFjxjR4XuvWrenWrRsLFizgnHPOYezYsTzyyCNVM4PWN6V0dWPGjGHhwoVUVFSwf/9+nyb5qzl19KRJk/jb3/5WtV252ln1qafffffdWsmukojwwx/+kBtuuIGpU6cSHR3NgAED2LlzJ9nZ2QC8+OKLnHvuuXW+f2M6dOiAx+Phs88+A6jVOO2zyi6kxTaWIJhYicC0GC+88AK33XYbP//5zwF48MEH6d27d6PnjR07liVLlhATE8PYsWPJycmpSgSTJk2qc0rpTp06VZ3/gx/8gCVLljB48GD69evHyJEjG53SuebU0Y899hi33XYbQ4YMoaysjHHjxjFnzhwefPBBMjIyGD58OOeeey7du3ev95oZGRnMnj2bWbOcqrno6Gj+8Y9/cPXVV1c1Fs+Y4TTPTZ8+nSlTppCSklJnO0FdnnnmGX7yk58QGxvL+PHjT27aao83ETx/CbSKbvr54W7Yj2D07X6/rE1D3YLZNNTNp3Lq6vz8fEaMGMGKFStITk4OdFh+Vfk7AsyaNYt9+/bx17/+tdZxDf4tHCtyepUdb9pCOsZrwMUw5IcndapNQ22Myy6++GIOHTrE8ePH+fWvf93ikgDAv//9b37/+99TVlZGjx49eO6555p+kTYeuPwJv8dmTo0lAmP8oDlKXYF2zTXXcM011wQ6DOMCayw2fhNq1YzG/+xvIDRZIjB+ER0dTX5+vn0QhDFVJT8/n+hoawQONVY1ZPyia9eu5OTkYLPDhrfo6Gi6drXZ5EONJQLjF1FRUScM3jLGhA6rGjLGmDBnicAYY8KcJQJjjAlzITeyWERygW/qeTkRqL0cVHAKpVghtOINpVghtOINpVghtOJ1O9YeqlrnEo8hlwgaIiJZ9Q2hDjahFCuEVryhFCuEVryhFCuEVryBjNWqhowxJsxZIjDGmDDX0hLB3EAH0AShFCuEVryhFCuEVryhFCuEVrwBi7VFtREYY4xpupZWIjDGGNNElgiMMSbMtYhEICKTRWSLiGSLyP2BjqcxIrJTRNaJyBoRCarl1kTkWRE5ICLrq+3rKCIfisg2788OgYyxunri/Y2I7PHe3zUiMjWQMVYSkW4islRENonIBhG507s/6O5vA7EG672NFpHPReQrb7wPefcH472t
L9aA3duQbyMQkUhgKzARyAFWARmqujGggTVARHYCaaoadANdRGQcUAy8oKqDvfv+CBSo6ixvou2gqvcFMs5K9cT7G6BYVR8JZGw1iUgKkKKqX4iIB1gNXA5MI8jubwOx/pDgvLcCxKpqsYhEAcuBO4ErCb57W1+skwnQvW0JJYIRQLaq7lDV48B84LIAxxSyVHUZUFBj92XA897nz+N8IASFeuINSqq6T1W/8D4vAjYBXQjC+9tArEFJHcXezSjvQwnOe1tfrAHTEhJBF2B3te0cgvgP1kuBD0RktYhMD3QwPjhNVfeB8wEBdApwPL64XUTWequOAl4dUJOI9ASGAf8hyO9vjVghSO+tiESKyBrgAPChqgbtva0nVgjQvW0JiUDq2Bfs9V3pqjocmALc5q3eMP7zFNAbGArsA/4U0GhqEJE4YCFwl6oWBjqehtQRa9DeW1UtV9WhQFdghIgMDnBI9aon1oDd25aQCHKAbtW2uwJ7AxSLT1R1r/fnAeB1nOqtYLbfW2dcWXd8IMDxNEhV93v/o1UA8wii++utE14IvKyqr3l3B+X9rSvWYL63lVT1EPAxTp17UN7bStVjDeS9bQmJYBXQV0RSRaQ1cC3wVoBjqpeIxHob3xCRWGASsL7hswLuLeAG7/MbgDcDGEujKv/je11BkNxfbyPhM8AmVf1ztZeC7v7WF2sQ39skEWnvfd4WuADYTHDe2zpjDeS9DfleQwDeblaPApHAs6r6cGAjqp+I9MIpBYCzVOj/C6Z4ReSfwHicKXH3Aw8CbwALgO7ALuBqVQ2KBtp64h2PU7xWYCdwS2U9cSCJyBjgU2AdUOHd/Uucuvegur8NxJpBcN7bITiNwZE4X3AXqOpvRSSB4Lu39cX6IgG6ty0iERhjjDl5LaFqyBhjzCmwRGCMMWHOEoExxoQ5SwTGGBPmLBEYY0yYs0RgzCkSkeJqz6d6Z7rsHsiYjGmKVoEOwJiWQkTOBx4HJqnqrkDHY4yvLBEY4wciMhZnWoCpqro90PEY0xQ2oMyYUyQipUARMF5V1wY6HmOaytoIjDl1pUAm8ONAB2LMybBEYMypq8BZuetsEflloIMxpqmsjcAYP1DVEhG5GPhURPar6jOBjskYX1kiMMZPVLVARCYDy0QkT1UDPuWxMb6wxmJjjAlz1kZgjDFhzhKBMcaEOUsExhgT5iwRGGNMmLNEYIwxYc4SgTHGhDlLBMYYE+b+P5v/ETF4rIT9AAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "k_values = range(1, y_train.size)\n",
    "weighted_voting_bools = [True, False]\n",
    "f_scores = [[], []]\n",
    "\n",
    "params_to_f = {}\n",
    "for k in k_values:\n",
    "    for i, weighted_voting in enumerate(weighted_voting_bools):\n",
    "        y_pred = np.array([predict(i, K=k, \n",
    "                           weighted_voting=weighted_voting) \n",
    "                        for i in range(y_test.size)])\n",
    "        f_measure = f1_score(y_pred, y_test, average='macro')\n",
    "        f_scores[i].append(f_measure)\n",
    "        params_to_f[(k, weighted_voting)] = f_measure\n",
    "\n",
    "(best_k, best_weighted), best_f = max(params_to_f.items(), \n",
    "                                      key=lambda x: x[1])\n",
    "plt.plot(k_values, f_scores[0], label='Weighted Voting')\n",
    "plt.plot(k_values, f_scores[1], label='Unweighted Voting')\n",
    "plt.axvline(best_k, c='k')\n",
    "plt.xlabel('K')\n",
    "plt.ylabel('F-measure')\n",
    "plt.legend()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We've optimized performance exhaustively iterating over all the possible input parameters. This exhaustive approach is called a **parameter sweep**, or a **grid search**.\n",
    "\n",
    "## 20.4 Running Grid Search Using Scikit-Learn\n",
    "\n",
    "Scikit-Learn has a built-in model for running KNN classification. In order to utilize this model, we must import the `KNeighborsClassifier` class.\n",
    "\n",
    "**Listing 20. 34. Importing Scikit-Learn’s KNN class**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initializing the class will create a KNN classifier object. Per common convention, we'll store this object in a `clf` variable.\n",
    "\n",
    "**Listing 20. 35. Initializing Scikit-Learn’s KNN classifier**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf = KNeighborsClassifier()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The initialized `clf` object has preset specifications for K and weighted voting.\n",
    "\n",
    "**Listing 20. 36. Printing the preset KNN parameters**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "K is set to 5.\n",
      "Weighted voting is set to 'uniform'.\n"
     ]
    }
   ],
   "source": [
    "K = clf.n_neighbors\n",
    "weighted_voting = clf.weights\n",
    "print(f\"K is set to {K}.\")\n",
    "print(f\"Weighted voting is set to '{weighted_voting}'.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can reinitialize `clf` with custom parameters.\n",
    "\n",
    "**Listing 20. 37. Setting Scikit-Learn’s KNN parameters**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf = KNeighborsClassifier(n_neighbors=4, weights='distance')\n",
    "assert clf.n_neighbors == 4\n",
    "assert clf.weights == 'distance'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we'll want to train our KNN model. Any Scikit-Learn `clf` classifier can be trained using the `fit` method.\n",
    "\n",
    "**Listing 20. 38. Training Scikit-Learn’s KNN classifier**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "KNeighborsClassifier(n_neighbors=4, weights='distance')"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After training, `clf` can predict the classes of any input `X_test` matrix (whose dimensions match `X_train`). Predictions are carried out with the `clf.predict` method.\n",
    "\n",
    "**Listing 20. 39. Predicting classes with a trained KNN classifier**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The predicted classes are:\n",
      "[2 1 0 2 0 2 0 1 1 1 2 1 1 1 1 0 1 1 0 0 2 1 0 0 2 0 0 1 1 0 2 1 0 2 2 1 0\n",
      " 2 1 1 2 0 2 0 0 1 2 2 1 2 1 2 1 1 1 1 1 1 1 2 1 0 2 1 1 1 1 2 0 0 2 1 0 0\n",
      " 1 0 2 1 0 1 2 1 0 2 2 2 2 0 0 2 2 0 2 0 2 2 0 0 2 0 0 0 1 2 2 0 0 0 1 1 0\n",
      " 0 1]\n",
      "\n",
      "The f-measure equals 0.95.\n"
     ]
    }
   ],
   "source": [
    "y_pred = clf.predict(X_test)\n",
    "f_measure = f1_score(y_pred, y_test, average='macro')\n",
    "print(f\"The predicted classes are:\\n{y_pred}\")\n",
    "print(f\"\\nThe f-measure equals {f_measure:.2f}.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Additionally, `clf` allows to extract more nuanced prediction output. For instance, we can generate the fraction of the votes received by class for an inputted sample in `X_test`.\n",
    "\n",
    "**Listing 20. 40. Outputting vote ratios for each class**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0.         0.21419074 0.78580926]\n",
      " [0.         1.         0.        ]\n",
      " [1.         0.         0.        ]\n",
      " [0.         0.         1.        ]]\n"
     ]
    }
   ],
   "source": [
    "vote_ratios = clf.predict_proba(X_test)\n",
    "print(vote_ratios[:4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, lets turn our attention to running grid search across `KNeighborsClassifier`. First we'll need to specify a dictionary mapping between our hyperparameters and their value-ranges.\n",
    "\n",
    "**Listing 20. 41. Defining a hyperparameter dictionary**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "hyperparams = {'n_neighbors': range(1, 40),\n",
    "              'weights': ['uniform', 'distance']}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll need to import Scikit-Learn's `GridSearchCV` class, which we'll use to execute the grid search.\n",
    "\n",
    "**Listing 20. 42. Importing Scikit-Learn’s grid search class**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import GridSearchCV"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we'll initialize the `GridSearchCV` class by passing the hyperparameter dictionary.\n",
    "\n",
    "**Listing 20. 43. Initializing Scikit-Learn’s grid search class**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf_grid = GridSearchCV(KNeighborsClassifier(), hyperparams, \n",
    "                        scoring='f1_macro')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running `clf_grid.fit(X, y)` will carry out the grid search.\n",
    "\n",
    "**Listing 20. 44. Running a grid search using Scikit-Learn**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GridSearchCV(estimator=KNeighborsClassifier(),\n",
       "             param_grid={'n_neighbors': range(1, 40),\n",
       "                         'weights': ['uniform', 'distance']},\n",
       "             scoring='f1_macro')"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf_grid.fit(X, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We've executed the grid-search. The optimized hyperparameters are stored within the `clf_grid.best_params_` attribute. Likewise, the f-measure associated with these parameters is stored within `clf_grid.best_score_`.\n",
    "\n",
    "**Listing 20. 45. Checking the optimized grid search results**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "A maximum f-measure of 0.99 is achieved with the following hyperparameters:\n",
      "{'n_neighbors': 10, 'weights': 'distance'}\n"
     ]
    }
   ],
   "source": [
    "best_f = clf_grid.best_score_\n",
    "best_params = clf_grid.best_params_\n",
    "print(f\"A maximum f-measure of {best_f:.2f} is achieved with the \"\n",
    "      f\"following hyperparameters:\\n{best_params}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The optimized KNN classifier is stored within the `clf_grid.best_estimator_` attribute.\n",
    "\n",
    "**Listing 20. 46. Accessing the optimized classifier**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf_best = clf_grid.best_estimator_\n",
    "assert clf_best.n_neighbors == best_params['n_neighbors']\n",
    "assert clf_best.weights == best_params['weights']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By leveraging `clf_best`, we can carry out predictions on new data. Alternatively, we can carry-out predictions directly with our optimized `clf_grid` object, by running `clf_grid.predict`.\n",
    "\n",
    "**Listing 20. 47. Generating predictions with `clf_grid`**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 20.5 Limitations of the KNN Algorithm\n",
    "\n",
    "he biggest problem with KNN is its speed. The algorithm can be very slow to run when the training set is large.\n",
    "We'll illustrate this slow-down by increasing elements within our training set `(X, y)` by 2000-fold. Afterwards, we'll time the grid-search process for the expanded data. **The code will take approximately 15 mintues to run. Hence, it has been commented out.**\n",
    "\n",
    "**Listing 20. 48. Optimizing KNN on a large training set**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "' \\nimport time\\nX_large = np.vstack([X for _ in range(2000)])\\ny_large = np.hstack([y for _ in range(2000)])\\nclf_grid = GridSearchCV(KNeighborsClassifier(), hyperparams, \\n                        scoring=\\'f1_macro\\')\\nstart_time = time.time()\\nclf_grid.fit(X_large, y_large)\\nrunning_time = (time.time() - start_time) / 60\\nprint(f\"The grid search took {running_time:.2f} minutes to run.\")\\n'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\"\"\" \n",
    "import time\n",
    "X_large = np.vstack([X for _ in range(2000)])\n",
    "y_large = np.hstack([y for _ in range(2000)])\n",
    "clf_grid = GridSearchCV(KNeighborsClassifier(), hyperparams, \n",
    "                        scoring='f1_macro')\n",
    "start_time = time.time()\n",
    "clf_grid.fit(X_large, y_large)\n",
    "running_time = (time.time() - start_time) / 60\n",
    "print(f\"The grid search took {running_time:.2f} minutes to run.\")\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
